Conversation
cdecker
left a comment
There was a problem hiding this comment.
I went through the first couple of commits, with a fine-toothed comb seeing that this is your first contribution. Don't worry we won't be as pedantic going forward :-)
I'll finish the review first thing tomorrow, but thought you might need the feedback early, so here's an incomplete review :-)
libs/gl-client/src/signer/mod.rs
Outdated
There was a problem hiding this comment.
This should likely be fatal, continuing here can get us into states we cannot recover from.
| "Signer state {}", | ||
| serde_json::to_string(&prestate).unwrap_or_else(|_| "<failed to serialize>".to_string()) | ||
| ); | ||
| log::trace!("Signer state {}", prestate_log); |
There was a problem hiding this comment.
And I notice that I am the one that introduced this non-fallible path 🤦
libs/gl-plugin/src/node/mod.rs
Outdated
There was a problem hiding this comment.
Which edge case are you referring to?
There was a problem hiding this comment.
You can ignore the TODOs from earlier commits, those were just implementation notes to myself while iterating. I’ve cleaned them up in the latest revision
libs/gl-plugin/src/node/mod.rs
Outdated
There was a problem hiding this comment.
This is the second time we do this rollup, let's add a method to the sketch that allows us to retrieve some stats in a repeatable way.
There was a problem hiding this comment.
Make it debug-printable and we have something concrete to search for in the logs for verification.
libs/proto/glclient/greenlight.proto
Outdated
There was a problem hiding this comment.
As in whether we should send the sketch across for verification on the counterparty's side? Could be a good idea temporarily to verify it works absolutely correct, and we can omit it later.
There was a problem hiding this comment.
Was thinking about extending the protos to handle deletions initially, but later went with the tombstone approach
Cargo.lock
Outdated
There was a problem hiding this comment.
WHere does this new dependency come from?
libs/gl-client/src/persist.rs
Outdated
There was a problem hiding this comment.
Why is this not 22? I might not be seeing things correctly, but I though line 668 sets the value as the version is 3 > 2 -> 22 should be set.
There was a problem hiding this comment.
To be precise I am assuming the following scenario: this diff instance is running on the gl-plugin, and the state contains (k2, 2, {"v": 2}), now the signer replies with the diff (k2, 3, {"v": 22}) which is set, so diffing the old state and the new state, I should be getting (k2, 3, {"v": 22}) as part of the patch that brings us from the per-state to the post-state. Or am I looking at the wrong side of the exchange. Comments as to what is intended can be useful to preempt these questions in future.
There was a problem hiding this comment.
Oh, I think I saw it now, you are checking the versions, not the values. Please check both.
There was a problem hiding this comment.
Done + added a helper function for better readability
Cargo.lock
Outdated
There was a problem hiding this comment.
And it gets removed again :-) A quick rebase should remove this altogether from the commit series.
cdecker
left a comment
There was a problem hiding this comment.
Very nice PR, I like it a lot. I think we might have had a small misunderstanding with the tombstone, in that I had in mind using the last valid version number to block any future updates on it, and sending that back and forth intrinsically making updates fail.
Also I'm not sure the conflicts can even happen, and what I think you were marking as conflict, is actually a normal thing to happen, i.e., a No-op following an old cached value being returned.
| self.put_tombstone(&node_key); | ||
| self.put_tombstone(&state_key); | ||
| self.put_tombstone(&allowlist_key); | ||
| self.put_tombstone(&tracker_key); |
There was a problem hiding this comment.
This is effectively writing 4 new keys, not touching the old keys if I see it correctly. By tombstone I meant more like (key, UINT64_MAX, NULL) making it impossible to ever overwrite in the future, and telling the counterparty that it is ok to forget the value, no newer value will ever be published.
There was a problem hiding this comment.
That'd also mean that no special handling needs to be implemented at all, obviating any tracking across unrelated keys.
Sorry for being so unprecise and making the tombstone sound like something bigger than it really is :-)
There was a problem hiding this comment.
Makes sense, just got carried away developing my little distributed DB lol. Simplified using the version based approach
| #[derive(Debug, Default)] | ||
| pub struct SafeMergeResult { | ||
| pub changes: Vec<(String, Option<u64>, u64)>, | ||
| pub conflict_count: usize, |
There was a problem hiding this comment.
No, you are counting overwrites with older versions as conflict, when really its not. The reason is that we may have multiple requests in flight, request A changes field A, and request B changes field B, if the signer B handles request B it'll return the A with the same version as before, which you would consider a conflict, but it is just stale, and merging will just pick the newer value. The staging takes care of ensuring that state is updated correctly, please copy those semantics where possible.
libs/gl-client/src/persist.rs
Outdated
There was a problem hiding this comment.
This also just falls away by making the tombstone just a special value (that behaves exactly like any other value, with the exception it is not updateable).
| trace!("hsmd hsm_id={} request processor started", hsm_id); | ||
| let mut last_sent_sketch = StateSketch::new(); | ||
| let mut last_seen_sync_epoch = state_sync_epoch.load(Ordering::SeqCst); | ||
| enum SyncMode { |
There was a problem hiding this comment.
Do we really need a full sync mode at all? As mentioned before the conflicts you identified are in fact no-ops, and therefore we don't end up falling back to full sync. The Diff mode will still do a full sync when the connection is first established simply because its sketch for what the signer already has is empty, so all kvv tuples are new from its PoV.
libs/gl-client/src/persist.rs
Outdated
There was a problem hiding this comment.
No need if tombstones are just normal values we can just omit them when sending the values to the signer during the initial sync. We'd still be storing them, but they'd not contribute to the inter-signer trafic,

Implements #745
Key design considerations