CASSANDRA-21209 Rework ZSTD dictionary compression logic to create a trainer per training #4667
smiklosovic wants to merge 5 commits into apache:trunk from
Configuration parsing is scattered all over the place. I think it should be centralized and resolved in one place only.
    }
    finally
    {
        refViewFragment.close();
I do not think what we did here was too smart (the same concept was there before we started to use selectAndReference). Training is done asynchronously, so this method returns and the finally block runs before the sampling has actually finished. We should close in the callback, as done above, or only when we catch an exception, as done here.
Or no?
trainer.trainDictionaryAsync(force).addCallback
Does this "addCallback" make a synchronous call? I do not think so. It just registers what should be done after the training finishes, but it is not a blocking call, I guess.
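The ordering concern above can be reproduced outside Cassandra. The following is a minimal sketch, not the PR's code: it uses the JDK's `CompletableFuture` in place of the future returned by `trainDictionaryAsync`, and `TrackedResource`, `closeInFinally`, `closeInCallback` and `demo` are hypothetical names introduced only for illustration. A latch holds the asynchronous task back until the calling method has returned, which makes the order of events deterministic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;

public class CloseInCallbackSketch
{
    /** Hypothetical stand-in for the referenced view fragment; records when it is closed. */
    public static final class TrackedResource implements AutoCloseable
    {
        private final List<String> events;
        TrackedResource(List<String> events) { this.events = events; }
        @Override public void close() { events.add("closed"); }
    }

    private static void awaitQuietly(CountDownLatch gate)
    {
        try { gate.await(); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); throw new IllegalStateException(e); }
    }

    /** Problematic shape: finally runs when this method returns, not when the async work ends. */
    public static CompletableFuture<Void> closeInFinally(List<String> events, CountDownLatch gate)
    {
        TrackedResource resource = new TrackedResource(events);
        try
        {
            return CompletableFuture.runAsync(() -> { awaitQuietly(gate); events.add("sampled"); });
        }
        finally
        {
            resource.close(); // executes before the task above has done any sampling
        }
    }

    /** Suggested shape: close in the completion callback, or right away if scheduling itself throws. */
    public static CompletableFuture<Void> closeInCallback(List<String> events, CountDownLatch gate)
    {
        TrackedResource resource = new TrackedResource(events);
        try
        {
            return CompletableFuture.runAsync(() -> { awaitQuietly(gate); events.add("sampled"); })
                                    .whenComplete((result, failure) -> resource.close());
        }
        catch (RuntimeException e)
        {
            resource.close(); // nothing was scheduled, so nobody else will close it
            throw e;
        }
    }

    /** Runs one variant and returns the recorded events in order. */
    public static List<String> demo(boolean useCallback)
    {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch gate = new CountDownLatch(1);
        CompletableFuture<Void> done = useCallback ? closeInCallback(events, gate) : closeInFinally(events, gate);
        gate.countDown(); // let the async task proceed only after the method has returned
        done.join();
        return events;
    }

    public static void main(String[] args)
    {
        System.out.println("close in finally:  " + demo(false)); // [closed, sampled]
        System.out.println("close in callback: " + demo(true));  // [sampled, closed]
    }
}
```

In the first variant the resource is released while the task still needs it. In the second, registering the callback does not block, yet the close is deferred until the task has completed, whether normally or exceptionally.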
force-pushed from ccbef18 to 596182d
    ScheduledExecutors.nonPeriodicTasks.submit(task);
    try
    {
        trainer = ICompressionDictionaryTrainer.create(keyspaceName, tableName, compressionParams);
The whole execution chain (from manager.train) does everything to postpone trainer creation until it is absolutely necessary and everything else is OK. Instantiating a trainer might be very demanding memory-wise (when the max sample size is not trivial), because it allocates a direct ByteBuffer. We do not want to create a trainer that allocates a big buffer just to throw it away if something else goes wrong.
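One way to picture the deferred creation described above is to pass a factory down the call chain and invoke it only after every cheap check has passed. This is a sketch under assumptions, not the actual Cassandra code: `SketchTrainer`, `train`, the two precondition checks and the `ALLOCATIONS` counter are hypothetical and exist only to make the allocation order observable. The one real API relied on is `ByteBuffer.allocateDirect`.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class LazyTrainerSketch
{
    /** Counts how many trainers (and therefore direct buffers) were created. */
    public static final AtomicInteger ALLOCATIONS = new AtomicInteger();

    /** Hypothetical trainer whose constructor is the expensive part. */
    public static final class SketchTrainer
    {
        public final ByteBuffer samples;

        public SketchTrainer(int maxSampleBytes)
        {
            ALLOCATIONS.incrementAndGet();
            samples = ByteBuffer.allocateDirect(maxSampleBytes); // off-heap, potentially large
        }
    }

    /**
     * Runs the cheap (hypothetical) precondition checks first and calls the
     * factory only once nothing else can fail before training starts.
     */
    public static SketchTrainer train(boolean dictionaryCompressionEnabled,
                                      int sstableCount,
                                      Supplier<SketchTrainer> trainerFactory)
    {
        if (!dictionaryCompressionEnabled)
            throw new IllegalStateException("dictionary compression is not enabled");
        if (sstableCount == 0)
            throw new IllegalStateException("no SSTables to sample from");

        return trainerFactory.get(); // the buffer is allocated here, not earlier
    }

    public static void main(String[] args)
    {
        try
        {
            train(false, 3, () -> new SketchTrainer(1024));
        }
        catch (IllegalStateException e)
        {
            System.out.println("rejected, allocations so far: " + ALLOCATIONS.get());
        }

        SketchTrainer trainer = train(true, 3, () -> new SketchTrainer(1024));
        System.out.println("created, buffer capacity: " + trainer.samples.capacity());
    }
}
```

Because the caller hands over a `Supplier` rather than a ready-made trainer, a failed precondition costs no direct memory at all.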