CASSANALYTICS-122: Use long for absolute times and support C* 5.0 extended localDeletionTime #183
skoppu22 wants to merge 10 commits into apache:trunk
Conversation
jyothsnakonisa
left a comment
Thanks for working on this, Shailaja. Looks good to me; left a few minor comments.
Do you want to revert these changes, or are they intentional?
Intentional, the CI pipelines fail otherwise.
cassandra-four-zero-bridge/src/test/java/org/apache/cassandra/spark/data/CqlTypeY2038Test.java
This class looks very similar to LocalTableSchemaStore; can you check if you can use that instead of adding this new class? You might make small modifications to LocalTableSchemaStore if required.
LocalTableSchemaStore and the new Avro transformer tests are in different modules. LocalTableSchemaStore loads avsc files from the classpath, which don't exist in those other modules. Hence we use TestSchemaStore instead, which doesn't load any schema files but allows registering schemas.
```diff
   CORE_MAX_PARALLEL_FORKS: 2
   CORE_TEST_MAX_HEAP_SIZE: "2048m"
   CASSANDRA_USE_JDK11: <<parameters.use_jdk11>>
   command: |
     export GRADLE_OPTS="-Xmx2g -Dorg.gradle.jvmargs=-Xmx2g"
     # Run compile/unit tests, skipping integration tests
-    ./gradlew --stacktrace clean assemble check -x cassandra-analytics-integration-tests:test -Dcassandra.analytics.bridges.sstable_format=<<parameters.sstable_format>>
+    ./gradlew --no-daemon --max-workers=2 --stacktrace clean assemble check -x cassandra-analytics-integration-tests:test -Dcassandra.analytics.bridges.sstable_format=<<parameters.sstable_format>>
```
Instead of making CI pipeline changes, can you try to tag the relevant tests with @Tag("Sequential")?
The --no-daemon flag makes sense.
- We don't know which tests cause this, so we can't tag them. Gradle gets killed while executing the pipelines starting with spark3-*, which run unit tests (gradlew check); pipelines starting with int-* run fine. This problem exists on trunk and was not created by this PR. Running CI with --info didn't give any clues, as it bloated the logs and CI killed Gradle.
- And we would have the same problem of finding the culprit every time CI breaks.
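For reference, the tagging approach suggested above could look roughly like this in Gradle (an illustrative sketch; the task name and wiring are assumptions, not taken from this PR):

```groovy
// build.gradle sketch (hypothetical): keep @Tag("Sequential") tests out of the
// parallel 'test' task and run them in a dedicated single-forked task instead.
test {
    useJUnitPlatform {
        excludeTags 'Sequential'
    }
}

tasks.register('sequentialTest', Test) {
    useJUnitPlatform {
        includeTags 'Sequential'
    }
    maxParallelForks = 1 // memory-hungry tests run one at a time
}
check.dependsOn 'sequentialTest'
```

This only helps once the offending tests are identified, which is the difficulty described above.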
```diff
 public static Map<String, Long> getTTL(CdcEvent event)
 {
     CdcEvent.TimeToLive ttl = event.getTtl();
     if (ttl == null)
     {
         return null;
     }
-    return mapOf(AvroConstants.TTL_KEY, ttl.ttlInSec, AvroConstants.DELETED_AT_KEY, ttl.expirationTimeInSec);
+    return mapOf(AvroConstants.TTL_KEY, (long) ttl.ttlInSec, AvroConstants.DELETED_AT_KEY, ttl.expirationTimeInSec);
 }
```
Why convert TTL to long when the schema has it as int? The change is probably unnecessary.
Here we are creating a map with one entry (TTL_KEY, the TTL value as int) and a second entry (DELETED_AT_KEY, the expirationTimeInSec value as long), so we have to widen to the larger of int and long. Otherwise we would need to return Map<String, Number> so that Number can represent both int and long.
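To illustrate the widening discussed above, here is a minimal standalone sketch (the key strings are stand-ins for the AvroConstants values, and `getTtl` is a simplified stand-in for the real method):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TtlMapSketch
{
    // Stand-ins for AvroConstants.TTL_KEY / AvroConstants.DELETED_AT_KEY
    static final String TTL_KEY = "ttl";
    static final String DELETED_AT_KEY = "deletedAt";

    // The map value type must fit both entries: ttlInSec is an int, but
    // expirationTimeInSec is a long, so the int is widened (losslessly) to long.
    static Map<String, Long> getTtl(int ttlInSec, long expirationTimeInSec)
    {
        Map<String, Long> map = new LinkedHashMap<>();
        map.put(TTL_KEY, (long) ttlInSec);
        map.put(DELETED_AT_KEY, expirationTimeInSec);
        return map;
    }

    public static void main(String[] args)
    {
        // An expiration time past 2038-01-19 no longer fits in a signed int
        Map<String, Long> m = getTtl(600, 2_147_483_648L);
        System.out.println(m.get(TTL_KEY) + " " + m.get(DELETED_AT_KEY)); // prints "600 2147483648"
    }
}
```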
```java
@ParameterizedTest
@MethodSource("org.apache.cassandra.cdc.test.TestVersionSupplier#testVersions")
public void testBasicInsertAvroEncoding(CassandraVersion version)
```
The file adds a few irrelevant test cases. Improving test coverage is in general a good thing, so I am fine with those test cases in this file.
We have updated two Avro schema files, and there were no existing tests going through the Avro transformers, hence we had to add them. The testMaxSupportedTtlAvroEncoding test below in this file verifies that deletedAt is a long and that the max TTL is supported as expected on both C* 4.x and 5.0.
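As an illustration of the kind of schema change being verified, an Avro field widened to long would be declared along these lines (a hedged sketch only; the record and field names in the actual avsc files in this PR may differ):

```json
{
    "type": "record",
    "name": "TtlInfo",
    "fields": [
        { "name": "ttl", "type": "int" },
        { "name": "deletedAt", "type": "long" }
    ]
}
```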
```diff
-    maxParallelForks = Math.max(Runtime.runtime.availableProcessors() * 2, 8)
+    maxHeapSize = System.getenv('CORE_TEST_MAX_HEAP_SIZE') ?: '3072m'
+    maxParallelForks = System.getenv('CORE_MAX_PARALLEL_FORKS')?.toInteger()
+            ?: Math.max(Runtime.runtime.availableProcessors() * 2, 8)
```
Please try out the Sequential test tag first.
As I mentioned above, the pipelines are broken on trunk and we don't know which test caused that; gradlew check is getting killed on CI.
```java
Dataset<Row> preExpiry = bulkReaderDataFrame(UDT_TTL_TABLE).load();
assertThat(preExpiry.collectAsList()).hasSize(ROW_COUNT);

Uninterruptibles.sleepUninterruptibly(80, TimeUnit.SECONDS);
```
Please find a different way and avoid sleeping the test for 80 seconds. We cannot have the CI sleep for 80 seconds. If there is no other way, please delete all the tests in this file; they are irrelevant to the patch anyway.
These tests are relevant. CqlSet, CqlMap, etc. are implemented in the C* 4.0 bridge and extended (i.e., reused) in the C* 5.0 bridge. While the 4.0 bridge expects an int, the 5.0 bridge expects a long. I have modified the files below to use CqlType's method, which is implemented separately for both bridges. And there are no existing tests for collection TTLs going through this modified code path.
```java
/**
 * Tests verifying Y2038 boundary behavior for {@link CqlType#tombstone} and {@link CqlType#expiring}
 * in Cassandra 4.0.
 */
class CqlTypeY2038Test
```
I do not think this test is valuable. Analytics/CDC does not set TTLs; it only reads the TTL values from Cassandra. The TTL values are guaranteed to be valid already; otherwise the CQL write would have failed. Please delete this file.
We are just ensuring that the checkedCast code we added does complain when the time exceeds the range an int can accommodate. But the test can be removed, since we are sure checkedCast does the job.
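As a minimal sketch of what such a guard checks (the actual helper in the patch may differ; `Math.toIntExact` is used here as a stand-in for the checkedCast behavior):

```java
public class CheckedCastSketch
{
    // Narrowing a long epoch-seconds value to int must fail loudly rather than
    // wrap around: a signed 32-bit seconds value runs out on 2038-01-19 (Y2038).
    static int checkedCast(long value)
    {
        return Math.toIntExact(value); // throws ArithmeticException on overflow
    }

    public static void main(String[] args)
    {
        System.out.println(checkedCast(2_000_000_000L)); // fits: well before 2038
        try
        {
            checkedCast(2_147_483_648L); // Integer.MAX_VALUE + 1, past Y2038
            System.out.println("no overflow detected");
        }
        catch (ArithmeticException e)
        {
            System.out.println("overflow detected");
        }
    }
}
```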
```diff
-    rowBuilder.addCell(BufferCell.expiring(cd, timestamp, ttl, now, type().serialize(o),
-                                           CellPath.create(ByteBuffer.wrap(UUIDGen.getTimeUUIDBytes()))));
+    rowBuilder.addCell(CqlType.expiring(cd, timestamp, ttl, now, type().serialize(o),
+                                        CellPath.create(ByteBuffer.wrap(UUIDGen.getTimeUUIDBytes()))));
 }
```
I think this change can be reverted once you drop CqlTypeY2038Test and revert the change in CqlType.
BufferCell.expiring in the 4.0 bridge expects an int for nowInSec; the 5.0 bridge expects a long. CqlList, as used in the C* 5.0 bridge, doesn't override this method to pass a long. Hence I have modified this to use CqlType's method, which is implemented separately for both bridges and uses the appropriate type in each. Without this change, we would have to override this method again in the 5.0 bridge to pass a long value.
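The bridge indirection described above can be sketched roughly as follows (the class and method shapes are illustrative assumptions, not the real CqlType/BufferCell signatures):

```java
// Hypothetical sketch of per-version bridges: shared code calls the bridge's
// expiring(...) with a long, and each bridge adapts to its Cassandra API.
public class BridgeSketch
{
    interface VersionBridge
    {
        String expiring(long nowInSec);
    }

    // 4.0-style API takes an int: narrow with an overflow check before delegating
    static class FourZeroBridge implements VersionBridge
    {
        public String expiring(long nowInSec)
        {
            int narrowed = Math.toIntExact(nowInSec);
            return "expiring(int " + narrowed + ")";
        }
    }

    // 5.0-style API takes a long: pass the value through unchanged
    static class FiveZeroBridge implements VersionBridge
    {
        public String expiring(long nowInSec)
        {
            return "expiring(long " + nowInSec + ")";
        }
    }

    public static void main(String[] args)
    {
        VersionBridge v4 = new FourZeroBridge();
        VersionBridge v5 = new FiveZeroBridge();
        System.out.println(v4.expiring(1_700_000_000L));
        System.out.println(v5.expiring(2_147_483_648L)); // past Y2038: only the 5.0 API fits it
    }
}
```

With this shape, collection types implemented once in the 4.0 bridge can call the shared method and get the correct width in each bridge.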
```diff
-    rowBuilder.addCell(BufferCell.expiring(cd, timestamp, ttl, now, valueType().serialize(entry.getValue()),
-                                           CellPath.create(keyType().serialize(entry.getKey()))));
+    rowBuilder.addCell(CqlType.expiring(cd, timestamp, ttl, now, valueType().serialize(entry.getValue()),
+                                        CellPath.create(keyType().serialize(entry.getKey()))));
```
CqlMap is implemented in the C* 4.0 bridge and extended (i.e., reused) in the C* 5.0 bridge. While the 4.0 bridge expects an int for nowInSec, the 5.0 bridge expects a long. I have modified this to use CqlType's method, which is implemented separately for both bridges and uses the appropriate type in each. Without this change, we would have to implement CqlMap again in the 5.0 bridge and override this function to pass a long value.
```diff
-    rowBuilder.addCell(BufferCell.expiring(cd, timestamp, ttl, now, ByteBufferUtil.EMPTY_BYTE_BUFFER,
-                                           CellPath.create(type().serialize(o))));
+    rowBuilder.addCell(CqlType.expiring(cd, timestamp, ttl, now, ByteBufferUtil.EMPTY_BYTE_BUFFER,
+                                        CellPath.create(type().serialize(o))));
```
Same as explained above for CqlMap.
CHANGES.txt
```diff
@@ -1,5 +1,6 @@
 0.4.0
 -----
+* Fix year 2038 problem using long for absolute times and support C* 5.0 extended localDeletionTime
```
The year 2038 problem sounds cool, but I feel it overshadows the actual change in this patch. I would revise the change log entry to "Support extended deletion time in CDC for Cassandra 5.0" and update the JIRA title too.
Changed deletedAt in the Avro schema to long. CI run here: https://app.circleci.com/pipelines/github/skoppu22/cassandra-analytics/75/workflows/f6d5f03b-8373-4549-8daf-5f776f78ff96