feat: WebSocket reconnection and session recovery (bd-critical-001)#294
feat: WebSocket reconnection and session recovery (bd-critical-001)#294khaliqgant wants to merge 2 commits intomainfrom
Conversation
- Add exponential backoff reconnection (1s, 2s, 4s, 8s, max 30s) - Implement session ID persistence in localStorage - Add message queue for offline sends - Add reconnect indicator in UI (Header component) - Server-side session tracking and message history - Sync missed messages on reconnect Implements bd-critical-001 launch blocker.
🤖 My Senior Dev — Analysis Complete👤 For @khaliqgant📁 Expert in View your contributor analytics → 📊 10 files reviewed • 4 high risk • 4 need attention 🚨 High Risk:
🚀 Open Interactive Review →The full interface unlocks features not available in GitHub:
💬 Chat here: 📖 View all 12 personas & slash commandsYou can interact with me by mentioning In PR comments or on any line of code:
Slash commands:
AI Personas (mention to get their perspective):
For the best experience, view this PR on myseniordev.com — includes AI chat, file annotations, and interactive reviews. |
| } else { | ||
| // Regular data update | ||
| const dashboardData = parsed as DashboardData; | ||
| setData(dashboardData); | ||
| // Update last message ID if available in the data | ||
| if (dashboardData.messages && dashboardData.messages.length > 0) { | ||
| const lastMsg = dashboardData.messages[dashboardData.messages.length - 1]; | ||
| if (lastMsg.id) { | ||
| lastMessageIdRef.current = lastMsg.id; | ||
| } | ||
| } |
There was a problem hiding this comment.
🔴 Client doesn't handle session_init response type, causing data extraction failure
When a new session is created, the server sends a response with type: 'session_init' containing data and lastMessageId fields. However, the client only explicitly handles session_restored and missed_messages types. The session_init response falls through to the else branch which incorrectly treats the entire message object as DashboardData.
Click to expand
Server sends (line 2252-2256):
ws.send(JSON.stringify({
type: isReconnection ? 'session_restored' : 'session_init',
data: initialData,
lastMessageId: sessionState.lastMessageId,
}));Client handles (lines 235-254):
if (parsed.type === 'missed_messages' ...) {
// handles missed_messages
} else if (parsed.type === 'session_restored') {
// handles session_restored - extracts parsed.data
} else {
// Falls through here for 'session_init'
const dashboardData = parsed as DashboardData; // WRONG!
setData(dashboardData);
// Tries to access dashboardData.messages which doesn't exist
}Impact:
For new sessions (not reconnections), the client receives { type: 'session_init', data: {...}, lastMessageId: ... } but treats the whole object as dashboard data. This means:
setData()is called with the wrong structure- The
lastMessageIdis not tracked (line 249-253 tries to accessdashboardData.messageswhich is undefined) - The actual dashboard data in
parsed.datais never extracted
Recommendation: Add explicit handling for session_init type similar to session_restored:
} else if (parsed.type === 'session_init') {
if (parsed.lastMessageId) {
lastMessageIdRef.current = parsed.lastMessageId;
}
if (parsed.data) {
setData(parsed.data as DashboardData);
}
} else {Was this helpful? React with 👍 or 👎 to provide feedback.
| const sessions = new Map<string, SessionState>(); | ||
| const wsToSessionId = new WeakMap<WebSocket, string>(); | ||
| const MAX_MISSED_MESSAGES = 100; // Keep last 100 messages per session |
There was a problem hiding this comment.
🔴 Server-side session Map grows unbounded - no TTL cleanup mechanism exists
The sessions Map stores session state for reconnection recovery, but there is no cleanup mechanism to remove stale sessions. The comment at line 2304 claims sessions "will be cleaned up by TTL" but no such TTL cleanup exists.
Click to expand
Session creation (line 2224):
sessions.set(sessionId, sessionState);On disconnect (lines 2301-2305):
if (sessionId) {
wsToSessionId.delete(ws);
// Keep session state for potential reconnection (will be cleaned up by TTL)
}Missing cleanup:
Unlike the cloud persistence handler which has a cleanup interval (lines 72-84 in initCloudPersistence), the WebSocket session tracking has no such mechanism.
Impact:
- Each unique session ID creates a new entry in the
sessionsMap - Session IDs are persisted in localStorage and reused, but if users clear localStorage or use different browsers, new session IDs are generated
- Each session stores up to 100 messages (
MAX_MISSED_MESSAGES), each containing full dashboard data - Over time, this causes unbounded memory growth on long-running servers
Recommendation: Add a TTL cleanup interval similar to the cloud persistence handler:
const SESSION_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours
const sessionCleanupInterval = setInterval(() => {
const now = Date.now();
for (const [sessionId, state] of sessions.entries()) {
if (now - state.lastConnectedAt.getTime() > SESSION_TTL_MS) {
sessions.delete(sessionId);
}
}
}, 5 * 60 * 1000); // Every 5 minutes
sessionCleanupInterval.unref();Was this helpful? React with 👍 or 👎 to provide feedback.
| // Exponential backoff: 1s, 2s, 4s, 8s, max 30s | ||
| const baseDelay = 1000; // 1 second | ||
| const delay = Math.min( | ||
| opts.reconnectDelay * Math.pow(2, reconnectAttemptsRef.current), | ||
| 30000 | ||
| baseDelay * Math.pow(2, reconnectAttemptsRef.current), | ||
| 30000 // max 30 seconds | ||
| ); |
There was a problem hiding this comment.
🟡 opts.reconnectDelay option is ignored - hardcoded base delay used instead
The reconnectDelay option from UseWebSocketOptions is included in the useCallback dependency array but is never actually used in the reconnection logic. The code hardcodes baseDelay = 1000 instead.
Click to expand
Option definition (line 43):
const DEFAULT_OPTIONS: Required<UseWebSocketOptions> = {
...
reconnectDelay: 1000,
};Actual usage (lines 200-205):
// Exponential backoff: 1s, 2s, 4s, 8s, max 30s
const baseDelay = 1000; // 1 second // <-- HARDCODED!
const delay = Math.min(
baseDelay * Math.pow(2, reconnectAttemptsRef.current),
30000
);Dependency array (line 266):
}, [opts.url, opts.reconnect, opts.maxReconnectAttempts, opts.reconnectDelay]);Impact:
Users cannot customize the base reconnection delay - passing reconnectDelay: 5000 has no effect. The option is misleading as it appears to be configurable but is actually ignored.
Recommendation: Use opts.reconnectDelay instead of hardcoded value:
const delay = Math.min(
opts.reconnectDelay * Math.pow(2, reconnectAttemptsRef.current),
30000
);Was this helpful? React with 👍 or 👎 to provide feedback.
|
|
||
| let sessionId = localStorage.getItem(SESSION_ID_KEY); | ||
| if (!sessionId) { | ||
| sessionId = `session-${Date.now()}-${Math.random().toString(36).substring(2, 9)}`; |
Check failure
Code scanning / CodeQL
Insecure randomness High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 1 month ago
In general, fix the problem by replacing Math.random() with a cryptographically secure random source (crypto.getRandomValues in the browser). Generate enough random bytes and encode them (for example, base36/hex) to get a reasonably long, unpredictable session identifier, then keep the rest of the logic (storing into localStorage, etc.) unchanged.
Concretely, within getOrCreateSessionId in packages/dashboard/ui/react-components/hooks/useWebSocket.ts, keep the Date.now() prefix if desired for debugging, but replace the Math.random().toString(36).substring(2, 9) fragment with a helper that uses window.crypto.getRandomValues. We can add a small local helper generateSecureRandomString(length: number): string above getOrCreateSessionId that fills a Uint8Array with secure random bytes and maps each nibble (4 bits) to base16 or base36 characters until the requested length is reached. Then build sessionId as e.g. `session-${Date.now()}-${generateSecureRandomString(16)}`. No external dependencies are required; window.crypto is standard in modern browsers, and the existing environment check typeof window === 'undefined' already protects from server-side execution.
| @@ -48,6 +48,30 @@ | ||
| const MESSAGE_QUEUE_KEY = 'agent-relay-ws-message-queue'; | ||
|
|
||
| /** | ||
| * Generate a cryptographically secure random string using window.crypto. | ||
| */ | ||
| function generateSecureRandomString(length: number): string { | ||
| if (typeof window === 'undefined' || !window.crypto || !window.crypto.getRandomValues) { | ||
| // Fallback: this should only be hit in non-browser environments where the session ID is unused. | ||
| return Math.random().toString(36).substring(2, 2 + length); | ||
| } | ||
|
|
||
| const bytes = new Uint8Array(length); | ||
| window.crypto.getRandomValues(bytes); | ||
|
|
||
| const chars = 'abcdefghijklmnopqrstuvwxyz0123456789'; | ||
| const charsLength = chars.length; | ||
| let result = ''; | ||
|
|
||
| for (let i = 0; i < bytes.length; i++) { | ||
| // Use modulo to map random byte to character index; slight bias is acceptable for identifiers. | ||
| result += chars[bytes[i] % charsLength]; | ||
| } | ||
|
|
||
| return result; | ||
| } | ||
|
|
||
| /** | ||
| * Generate or retrieve session ID from localStorage | ||
| */ | ||
| function getOrCreateSessionId(): string { | ||
| @@ -57,7 +81,7 @@ | ||
|
|
||
| let sessionId = localStorage.getItem(SESSION_ID_KEY); | ||
| if (!sessionId) { | ||
| sessionId = `session-${Date.now()}-${Math.random().toString(36).substring(2, 9)}`; | ||
| sessionId = `session-${Date.now()}-${generateSecureRandomString(16)}`; | ||
| localStorage.setItem(SESSION_ID_KEY, sessionId); | ||
| } | ||
| return sessionId; |
| } else { | ||
| // Client is disconnected - store message for session if exists | ||
| const sessionId = wsToSessionId.get(client); | ||
| if (sessionId) { | ||
| const sessionState = sessions.get(sessionId); | ||
| if (sessionState) { | ||
| sessionState.messages.push({ | ||
| id: messageId, | ||
| data: JSON.parse(payload), | ||
| timestamp, | ||
| }); | ||
|
|
||
| // Keep only last MAX_MISSED_MESSAGES messages | ||
| if (sessionState.messages.length > MAX_MISSED_MESSAGES) { | ||
| sessionState.messages.shift(); | ||
| } | ||
|
|
||
| sessionState.lastMessageId = messageId; | ||
| } | ||
| } |
There was a problem hiding this comment.
🔴 Dead code: storing messages for disconnected clients will never execute
The code that attempts to store messages for disconnected clients in broadcastData is unreachable because wss.clients only contains connected clients.
Click to expand
Issue
In broadcastData() at lines 2072-2091, there's an else branch that tries to store messages for disconnected clients:
} else {
// Client is disconnected - store message for session if exists
const sessionId = wsToSessionId.get(client);
...
}However, wss.clients is a Set that only contains connected WebSocket clients. Once a client disconnects, it's automatically removed from this Set. Therefore, the else branch (when client.readyState !== WebSocket.OPEN) will never execute for truly disconnected clients.
Actual vs Expected
- Actual: Messages are never stored for disconnected clients because they're not in
wss.clients. - Expected: If the intent is to queue messages for reconnecting clients, the logic needs to be restructured to track sessions independently of the WebSocket client set.
Impact
The missed message recovery feature won't work as intended. Clients that disconnect and reconnect won't receive messages that were broadcast while they were disconnected.
Was this helpful? React with 👍 or 👎 to provide feedback.
Implements WebSocket reconnection with exponential backoff and session recovery.
Fixes bd-critical-001 launch blocker.