Redesign MRP kick: abort exchange and restart fresh#3399
Redesign MRP kick: abort exchange and restart fresh#3399
Conversation
When DNS-SD resolves a peer address, PeerConnection calls exchange.kick() to short-circuit the MRP backoff timer and retransmit immediately. The kick checks whether the retransmission timer is currently running, and if not, does nothing. There is a race in Node.js's timer phase: the retransmission timer and the DNS-SD notify setTimeout(0) both expire at roughly the same time. The retransmission timer fires first, stopping itself and starting an async channel.send(). The notify fires next, the kick arrives, sees the timer is not running (channel.send() is still in flight), and is silently dropped. When channel.send() completes, #initializeResubmission starts a fresh 2-minute backoff — ignoring that a kick was requested. Fix by adding a #kickPending flag. When kick() fires and the timer is not running, we set the flag instead of doing nothing. #initializeResubmission checks the flag after channel.send() completes and, if set, starts the next timer with zero delay so the retransmission fires immediately. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Fixes a race in MRP retransmission “kick” handling within MessageExchange so that a kick arriving while a retransmission send is in-flight is not silently dropped, and instead forces the next resubmission to occur immediately.
Changes:
- Add
#kickPendingto remember kicks that arrive when the retransmission timer isn’t running (send in-flight). - Clear
#kickPendingon message boundary (send()finally) and consume it in#initializeResubmission()to start the next timer with zero delay.
You can also share your feedback on Copilot code review. Take the survey.
…circuiting timer
Instead of calling exchange.kick() to short-circuit the MRP retransmission timer,
a kick now aborts the current CASE handshake exchange via a local abort signal and
lets connect() loop immediately to a fresh exchange with a clean MRP backoff (~2.25s
first retry). This means a DNS-SD address discovery or an explicit peer.kick() takes
effect within seconds instead of waiting out the current (potentially many-minute)
MRP backoff interval.
Two KickOrigin categories ("discover" and "connect") control separate rate limits
via PeerTimingParameters:
- "discover" (DNS-SD triggered): 30 min minimum between restarts
- "connect" (user/explicit): 10 min minimum between restarts
The QuietObservable on the kicker still provides a 3-second debounce as first-line
defense against rapid mDNS churn before the per-origin rate limit is consulted.
The exchange.kick() method and its supporting #kick field in MessageExchange are
removed entirely — kick logic now lives at the PeerConnection layer via localAbort.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
You can also share your feedback on Copilot code review. Take the survey.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
| * allowing the new address to be used immediately. This threshold prevents mDNS churn from | ||
| * restarting the handshake too frequently. | ||
| */ | ||
| mrpKickRestartIntervalDiscover: Duration; |
There was a problem hiding this comment.
This is getting pretty esoteric :)
Trying to think how we could make these more comprehensible, maybe something like:
retryBackoffResetWindow: {
connect?: Duration,
addressChange?: Duration,
}...but I acknowledge this is no more likely to be understood :)
Summary
Replaces the MRP kick mechanism with a cleaner architectural approach:
isRunning === false).localAbortsignal and letsconnect()loop immediately to a fresh exchange with a clean MRP backoff (~2.25s first retry). DNS-SD address discovery or an explicitpeer.kick()now takes effect within seconds instead of waiting out a potentially many-minute backoff.Key changes
MessageExchange: Removedexchange.kick()method and the#kick/#kickPendingfields entirely — kick logic now lives at thePeerConnectionlayerPeerConnection: AddedKickOrigintype ("discover"|"connect"),lastRestartAtshared state, andlocalAbort-based restart logic inattemptOnce()PeerTimingParameters: AddedmrpKickRestartIntervalDiscover(30 min) andmrpKickRestartIntervalConnect(10 min) rate-limit thresholdsPeer: UpdatedQuietObservable<[KickOrigin]>typing andConnectionProcessinterface to thread origin throughRate limiting (two layers)
QuietObservable(3s debounce,skipSuppressedEmits: true): First-line defense against rapid-fire mDNS churnlastRestartAtthreshold: Shared across all concurrent address attempts so at most one restart fires per kickDesign notes
localAbortwrapsaddressAbort(notoverallAbort) — firing it aborts only the currentpair()call, not the entire address retry looplastRestartAtis scoped to thePeerConnectioncall, shared across concurrent address attempts to prevent burst restartspair()succeeds,overallAbort()has already fired, makinglocalAbort()harmlessTest plan
PeerConnectionKickTest.tscovering:🤖 Generated with Claude Code