Skip to content

CASSJAVA-104: Fix Flaky tests#2045

Open
SiyaoIsHiding wants to merge 27 commits intoapache:4.xfrom
SiyaoIsHiding:flaky-test
Open

CASSJAVA-104: Fix Flaky tests#2045
SiyaoIsHiding wants to merge 27 commits intoapache:4.xfrom
SiyaoIsHiding:flaky-test

Conversation

@SiyaoIsHiding
Copy link
Contributor

@SiyaoIsHiding SiyaoIsHiding commented Jul 2, 2025

Ready for review

@SiyaoIsHiding SiyaoIsHiding changed the title Flaky tests CASSJAVA-104: Fix Flaky tests Jul 15, 2025
Comment on lines +86 to +92
protected CqlPrepareAsyncProcessor(
Optional<? extends DefaultDriverContext> context, CacheBuilder<Object, Object> cacheBuilder) {
this.cache = cacheBuilder.build();
context.ifPresent(
(ctx) -> {
LOG.info("Adding handler to invalidate cached prepared statements on type changes");
EventExecutor adminExecutor = ctx.getNettyOptions().adminEventExecutorGroup().next();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@absurdfarce
CacheBuilder doesn't have a strongValues() method, so I guess our proposed solution about putting strongValues() to override the weak values wouldn't work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like there's an easier way to make that happen that doesn't involve introducing a parallel method to the one we've already exposed:

diff --git a/core/src/main/java/com/datastax/oss/driver/internal/core/cql/CqlPrepareAsyncProcessor.java b/core/src/main/java/com/datastax/oss/driver/internal/core/cql/CqlPrepareAsyncProcessor.java
index a3d11cff0..7948931f0 100644
--- a/core/src/main/java/com/datastax/oss/driver/internal/core/cql/CqlPrepareAsyncProcessor.java
+++ b/core/src/main/java/com/datastax/oss/driver/internal/core/cql/CqlPrepareAsyncProcessor.java
@@ -64,14 +64,18 @@ public class CqlPrepareAsyncProcessor
   }
 
   public CqlPrepareAsyncProcessor(@NonNull Optional<? extends DefaultDriverContext> context) {
-    this(context, Functions.identity());
+      // In the default case we mark our underlying cache as using weak values
+    this(context, builder->builder.weakValues());
   }
 
   protected CqlPrepareAsyncProcessor(
       Optional<? extends DefaultDriverContext> context,
       Function<CacheBuilder<Object, Object>, CacheBuilder<Object, Object>> decorator) {
 
-    CacheBuilder<Object, Object> baseCache = CacheBuilder.newBuilder().weakValues();
+      // Note that the base cache does NOT use weak values like the one-arg constructor above does!  If
+      // you provide a decorator function for the CacheBuilder you must call weakValues() in that function
+      // if you want your cache to use weak values.
+    CacheBuilder<Object, Object> baseCache = CacheBuilder.newBuilder();
     this.cache = decorator.apply(baseCache).build();
     context.ifPresent(
         (ctx) -> {

@tolbertam tolbertam self-requested a review February 23, 2026 18:54
Copy link
Contributor

@absurdfarce absurdfarce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good to me; several great finds in here @SiyaoIsHiding! A couple of relatively minor changes I'd kinda like to see but overall I'm quite positive on where this PR is headed.

Comment on lines +86 to +92
protected CqlPrepareAsyncProcessor(
Optional<? extends DefaultDriverContext> context, CacheBuilder<Object, Object> cacheBuilder) {
this.cache = cacheBuilder.build();
context.ifPresent(
(ctx) -> {
LOG.info("Adding handler to invalidate cached prepared statements on type changes");
EventExecutor adminExecutor = ctx.getNettyOptions().adminEventExecutorGroup().next();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like there's an easier way to make that happen that doesn't involve introducing a parallel method to the one we've already exposed:

diff --git a/core/src/main/java/com/datastax/oss/driver/internal/core/cql/CqlPrepareAsyncProcessor.java b/core/src/main/java/com/datastax/oss/driver/internal/core/cql/CqlPrepareAsyncProcessor.java
index a3d11cff0..7948931f0 100644
--- a/core/src/main/java/com/datastax/oss/driver/internal/core/cql/CqlPrepareAsyncProcessor.java
+++ b/core/src/main/java/com/datastax/oss/driver/internal/core/cql/CqlPrepareAsyncProcessor.java
@@ -64,14 +64,18 @@ public class CqlPrepareAsyncProcessor
   }
 
   public CqlPrepareAsyncProcessor(@NonNull Optional<? extends DefaultDriverContext> context) {
-    this(context, Functions.identity());
+      // In the default case we mark our underlying cache as using weak values
+    this(context, builder->builder.weakValues());
   }
 
   protected CqlPrepareAsyncProcessor(
       Optional<? extends DefaultDriverContext> context,
       Function<CacheBuilder<Object, Object>, CacheBuilder<Object, Object>> decorator) {
 
-    CacheBuilder<Object, Object> baseCache = CacheBuilder.newBuilder().weakValues();
+      // Note that the base cache does NOT use weak values like the one-arg constructor above does!  If
+      // you provide a decorator function for the CacheBuilder you must call weakValues() in that function
+      // if you want your cache to use weak values.
+    CacheBuilder<Object, Object> baseCache = CacheBuilder.newBuilder();
     this.cache = decorator.apply(baseCache).build();
     context.ifPresent(
         (ctx) -> {

session.execute(
"CREATE TABLE test_table_caching_1 (e int primary key, f frozen<test_type_caching_1>)");
session.execute(
"CREATE TABLE test_table_caching_2 (g int primary key, h frozen<test_type_caching_2>)");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job, former me 🤦

Should the names of the tables and types created here not follow the basic/collection/tuple/nested split we use for setupCacheEntryTest* method names throughout? Seems like this should be:

        session.execute("CREATE TYPE test_type_basic_1 (a text, b int)");
        session.execute("CREATE TYPE test_type_basic_2 (c int, d text)");
        session.execute("CREATE TABLE test_table_basic_1 (e int primary key, f frozen<test_type_1>)");
        session.execute("CREATE TABLE test_table_basic_2 (g int primary key, h frozen<test_type_2>)");

and so on if we really want to avoid collisions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invalidationResultSetTest, invalidationVariableDefsTest, and invalidationTestInner are shared among basic/collection/tuple/nested. They need to be refactored if we want to change the table names to specify basic/collection/tuple/nested. I don't think it's worth it, cuz test methods inside of one test suite won't run in parallel anyway

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But don't we have to change the table name since these tests don't clean up after themselves? Once you execute the tests for, say, basic you've already created test_type_1, test_type_2 and all the rest. Won't you have a conflict then if you try to create those types against the same Cassandra backend (since they were never removed)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test suite has a @Rule instead of a @ClassRule, that means for every test method, CCM bridge creates a separate keyspace and session.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, is that how that works? I remembered that working differently. My recollection was that CcmBridge just controlled the ccm interaction and that the session rule was the entity that constrained ops to a given keyspace.

Yeah, as I read this SessionRule can create a new keyspace if the user specifies it in the builder but this IT does not currently do so and this PR doesn't change that behaviour.

I do agree we need a new keyspace per test (since without that the create type calls will definitely stomp on each other) but I don't think we're quite there yet are we?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, as I look at this again... why would the changes to this test do anything to fix the underlying problem? With the changes in this PR you still have exactly the same name collisions we had in the previous test... now everything just collides with test_table_caching or test_type_caching instead of test_table and test_type.

Setting aside the question of session isolation (which I still don't think we have right) why... is this better?

Copy link
Contributor

@absurdfarce absurdfarce Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I get it now... the problem was that it was conflicting with UdtCodecIT, not that it was conflicting with the four variations of the test executed by this IT. 🤦

I completely misunderstood the problem... I'm good with this now.

public TestCqlPrepareAsyncProcessor(@NonNull Optional<DefaultDriverContext> context) {
// Default CqlPrepareAsyncProcessor uses weak values here as well. We avoid doing so
// to prevent cache entries from unexpectedly disappearing mid-test.
super(context, CacheBuilder.newBuilder());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would become:

   super(context, Functions.identity())

given the change referenced above.

@SiyaoIsHiding
Copy link
Contributor Author

PR updated except the "basic/collection/tuple/nested" in table names


public static UrlProvisionOption driverQueryBuilderBundle() {
return bundle("reference:file:../query-builder/target/classes");
return bundle("reference:file:../query-builder/target/classes").startLevel(2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guessing that deferring these later than the shaded bundle (and other deps) allows these to more reliably depend on the guava dep being there?

Comment on lines +66 to +76
this(context, CacheBuilder::weakValues);
}

protected CqlPrepareAsyncProcessor(
Optional<? extends DefaultDriverContext> context,
Function<CacheBuilder<Object, Object>, CacheBuilder<Object, Object>> decorator) {

CacheBuilder<Object, Object> baseCache = CacheBuilder.newBuilder().weakValues();
// CASSJAVA-104
// Note that the base cache does NOT use weak values like the one-arg constructor it previously
// does!
CacheBuilder<Object, Object> baseCache = CacheBuilder.newBuilder();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
this(context, CacheBuilder::weakValues);
}
protected CqlPrepareAsyncProcessor(
Optional<? extends DefaultDriverContext> context,
Function<CacheBuilder<Object, Object>, CacheBuilder<Object, Object>> decorator) {
CacheBuilder<Object, Object> baseCache = CacheBuilder.newBuilder().weakValues();
// CASSJAVA-104
// Note that the base cache does NOT use weak values like the one-arg constructor it previously
// does!
CacheBuilder<Object, Object> baseCache = CacheBuilder.newBuilder();
// Use weakValues to evict prepared statements from the cache as soon are they are
// no longer referenced elsewhere.
this(context, CacheBuilder::weakValues);
}
protected CqlPrepareAsyncProcessor(
Optional<? extends DefaultDriverContext> context,
Function<CacheBuilder<Object, Object>, CacheBuilder<Object, Object>> decorator) {
CacheBuilder<Object, Object> baseCache = CacheBuilder.newBuilder();

Was a little confused by this. It looks like the 1-argument constructor is used by the codebase, and the 2 argument constructor is used by tests, by moving the weakValues to the 1arg constructor, this allows the tests to work around the cache values being evicted as soon as they are no longer referenced elsewhere.

Think it would be better to just add a comment in the 1-arg constructor where weakValue is declared so it will be more clear why it's used there.

Copy link
Contributor

@absurdfarce absurdfarce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks (again) for taking this on @SiyaoIsHiding!

Similar to what I said on the CONTRIBUTING updates: I'm good with the state of this right now. Holding on giving an explicit 👍 until @tolbertam confirms that he's also good with everything as it stands.

import org.junit.rules.TestRule;
import org.junit.runner.RunWith;

@RunWith(DataProviderRunner.class)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To remind myself: add that @Category(ParallelizableTests.class) back

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants