fix(channel): direct relay retrigger results to prevent truncated replies #367
lutr0 wants to merge 2 commits into spacedriveapp:main from
Conversation
When a branch/worker completes and the retrigger relay LLM pass runs, the model sometimes generates only a teaser (e.g. "Hey — here's where we stand:") instead of relaying the full result. This leaves users seeing truncated responses. Two fixes:

1. Direct relay mode: skip the second LLM pass entirely and send branch/worker results directly to the user. For a single successful result, the raw output is sent as-is. For multiple results or failures, a structured summary is composed. This eliminates the truncation problem at its source.
2. Truncation guardrail (defense in depth): if the LLM relay path is still reached and produces a suspiciously short reply (<80 chars) while the queued summary is substantial (>200 chars), automatically send the full summary as a fallback.
Walkthrough
Adds deterministic direct-relay branches for retrigger results: short-circuiting LLM turns to emit pending retrigger summaries directly (with logging and error handling), plus a guardrail that falls back to direct relay when the summary length vs. last assistant message meets thresholds.
Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
🚥 Pre-merge checks | ✅ Passed checks (3 passed)
Actionable comments posted: 1
🧹 Nitpick comments (3)
src/agent/channel.rs (3)
1721: Consider extracting truncation thresholds as named constants.

The magic numbers `80` and `200` encode important heuristics but lack documentation. Named constants would clarify intent and make future tuning easier.

```diff
+const TRUNCATED_REPLY_THRESHOLD_CHARS: usize = 80;
+const SUBSTANTIAL_SUMMARY_THRESHOLD_CHARS: usize = 200;
+
 // In the is_retrigger block:
-if last_assistant_len < 80 && summary.chars().count() > 200 {
+if last_assistant_len < TRUNCATED_REPLY_THRESHOLD_CHARS
+    && summary.chars().count() > SUBSTANTIAL_SUMMARY_THRESHOLD_CHARS {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/channel.rs` at line 1721, Extract the numeric truncation thresholds into well-named constants (e.g., MIN_LAST_ASSISTANT_LEN = 80 and SUMMARY_TRUNCATE_THRESHOLD = 200) and replace the magic numbers in the condition that uses last_assistant_len and summary.chars().count() (the if in src/agent/channel.rs) with those constants; declare the constants near the top of the module or next to related functions (so intent and tunability are clear) and add a short comment explaining each constant's purpose.
2754-2773: Direct relay sends raw worker output for single successful results.

For a single successful result, the raw `r.result` is sent directly without any framing. This could include very long or unformatted output that wasn't intended to be user-facing. Consider whether single-result output should also get minimal framing (e.g. a brief header) to maintain consistency with the multi-result format, or if the raw relay is intentional for simplicity.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/channel.rs` around lines 2754 - 2773, The current mapping closure over map(|(idx, r)| { ... }) sends raw r.result when result_count == 1 and r.success, risking unframed/unfriendly output; change the single-result branch in that closure so even for a single successful result you emit a minimal framed message (for example include idx+1, r.process_type and status as a short header and then r.result) to match the multi-result format and keep output consistent with the else branch that formats "Update {} ({}, {}):\n{}".
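If minimal framing is adopted, it could reuse the multi-result format for the single-result case. A sketch along those lines (the `PendingResult` struct and its fields are assumptions inferred from the review diff, not the project's actual types):

```rust
// Hypothetical sketch: frame even a single successful result with the same
// short header the multi-result branch uses, so output stays consistent.
struct PendingResult {
    process_type: String,
    success: bool,
    result: String,
}

fn frame_result(idx: usize, r: &PendingResult) -> String {
    let status = if r.success { "ok" } else { "failed" };
    format!(
        "Update {} ({}, {}):\n{}",
        idx + 1,
        r.process_type,
        status,
        r.result
    )
}
```

The single-result branch would then call `frame_result(0, &r)` instead of cloning `r.result` directly.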
2748-2817: Direct relay effectively deprecates the LLM-based retrigger path.

The direct relay path will execute whenever `pending_results` is non-empty (which is guaranteed by the check at line 2736). The only way to reach the LLM-based retrigger code starting at line 2819 is if all pending results have empty `result` strings, an unlikely edge case. This is a significant behavior change: users will receive raw worker/branch output instead of LLM-synthesized summaries. This aligns with the PR objective of avoiding truncated teaser replies, but consider:
- Adding a comment explaining when the LLM path would be reached
- For multiple results, users now receive all outputs concatenated rather than a summarized relay
The history handling is correct: `pop_retrigger_bridge_message` is called before pushing the new assistant message, maintaining the invariant against consecutive assistant messages.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/channel.rs` around lines 2748 - 2817, The direct-relay branch using pending_results -> direct_relay_text will always win when any pending_results contain non-empty result strings, effectively bypassing the LLM retrigger path; add a clear comment above the direct_relay_text construction explaining that the LLM path is only reached when all pending_results have empty result strings and why that is acceptable, and then change the multi-result handling to either produce a concise summary instead of raw concatenation or gate raw concatenation behind a flag: introduce/ call a summarization helper (e.g., summarize_pending_results) when result_count > 1 (or consult a config flag) so users don’t get unfiltered verbose outputs, and keep the existing history handling (pop_retrigger_bridge_message, push Assistant message), response_tx send, and retrigger_count/pending_* clearing logic intact.
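The suggested `summarize_pending_results` helper might, for instance, cap each result at a preview length rather than concatenating raw output. A sketch (the helper name comes from the review suggestion; the struct, fields, and preview constant are hypothetical):

```rust
// Hypothetical cap on how much of each result is relayed verbatim.
const RESULT_PREVIEW_CHARS: usize = 400;

struct PendingResult {
    process_type: String,
    success: bool,
    result: String,
}

// Produce a concise multi-result summary instead of raw concatenation.
fn summarize_pending_results(results: &[PendingResult]) -> String {
    results
        .iter()
        .enumerate()
        .map(|(idx, r)| {
            let status = if r.success { "ok" } else { "failed" };
            let preview: String = r.result.chars().take(RESULT_PREVIEW_CHARS).collect();
            let suffix = if r.result.chars().count() > RESULT_PREVIEW_CHARS {
                "…"
            } else {
                ""
            };
            format!(
                "Update {} ({}, {}):\n{}{}",
                idx + 1,
                r.process_type,
                status,
                preview,
                suffix
            )
        })
        .collect::<Vec<_>>()
        .join("\n\n")
}
```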
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: eefbeddf-1193-4758-a171-90e9f4ba29a0
📒 Files selected for processing (1)
src/agent/channel.rs
```rust
if last_assistant_len < 80 && summary.chars().count() > 200 {
    tracing::warn!(
        channel_id = %self.id,
        last_assistant_len,
        summary_len = summary.chars().count(),
        "retrigger reply looks truncated; sending summary fallback"
    );
    self.state.conversation_logger.log_bot_message(
        &self.state.channel_id,
        summary,
    );
    if let Err(error) = self
        .response_tx
        .send(OutboundResponse::Text(summary.to_string()))
        .await
    {
        tracing::error!(
            %error,
            channel_id = %self.id,
            "failed to send retrigger truncation fallback reply"
        );
    }
} else {
    tracing::debug!(
        channel_id = %self.id,
        "skipping retrigger summary injection; relay reply already preserved"
    );
}
```
Truncation fallback sends message to user without updating history.
When the truncation guardrail triggers (lines 1728-1742), the full summary is sent to the user, but history still contains only the truncated assistant reply. This creates a mismatch: the user sees the full summary, but subsequent LLM turns will only see the truncated message in history.
The else branch at lines 1749-1776 correctly handles history injection. The fallback path should do the same to maintain consistency.
🔧 Proposed fix: update history after sending fallback
```diff
 if last_assistant_len < 80 && summary.chars().count() > 200 {
     tracing::warn!(
         channel_id = %self.id,
         last_assistant_len,
         summary_len = summary.chars().count(),
         "retrigger reply looks truncated; sending summary fallback"
     );
     self.state.conversation_logger.log_bot_message(
         &self.state.channel_id,
         summary,
     );
     if let Err(error) = self
         .response_tx
         .send(OutboundResponse::Text(summary.to_string()))
         .await
     {
         tracing::error!(
             %error,
             channel_id = %self.id,
             "failed to send retrigger truncation fallback reply"
         );
     }
+    // Update history so the LLM sees what the user actually received
+    let mut history = self.state.history.write().await;
+    pop_retrigger_bridge_message(&mut history);
+    history.push(rig::message::Message::Assistant {
+        id: None,
+        content: OneOrMany::one(rig::message::AssistantContent::text(
+            summary.to_string(),
+        )),
+    });
 } else {
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/agent/channel.rs` around lines 1721 - 1748, When the truncation fallback
sends the full summary to the user (the branch that calls
self.state.conversation_logger.log_bot_message and
self.response_tx.send(OutboundResponse::Text(summary.to_string())), you must
also update the conversation history the same way the else branch does so future
LLM turns see the full summary; mirror the history-injection logic from the else
branch (the same call(s) that the else path uses to persist/append the preserved
assistant reply into memory) immediately after a successful send, and handle any
errors similarly to maintain consistency between what the user saw and what is
stored.
This was by design and I personally have not had much issue with it; however, I can see why you might want this. Could we make this a setting that is off by default? Ideally per agent in config → Channel Behavior.
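One possible shape for such a per-agent flag, as a plain Rust sketch rather than the project's real config schema (every name here is hypothetical):

```rust
// Hypothetical "Channel Behavior" config section gating the direct-relay path.
// Off by default, so the existing LLM relay behavior is preserved unless an
// agent explicitly opts in.
#[derive(Debug, Clone)]
struct ChannelBehavior {
    direct_relay_retrigger_results: bool,
}

impl Default for ChannelBehavior {
    fn default() -> Self {
        Self {
            direct_relay_retrigger_results: false,
        }
    }
}

// The retrigger handler would consult the flag before short-circuiting.
fn should_direct_relay(cfg: &ChannelBehavior, has_pending_results: bool) -> bool {
    cfg.direct_relay_retrigger_results && has_pending_results
}
```

With the default config, `should_direct_relay` returns false even when results are pending, so the LLM relay path stays the default.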
♻️ Duplicate comments (1)
src/agent/channel.rs (1)
1809-1830: ⚠️ Potential issue | 🟠 Major

Truncation fallback still creates history mismatch.

When the truncation guardrail triggers (lines 1809-1830), the full summary is sent to the user, but the conversation history retains the truncated assistant reply that was preserved via `retrigger_reply_preserved`. This means subsequent LLM turns will see the short teaser in history while the user saw the full summary. The `else` branch at lines 1837-1863 correctly injects content into history. The truncation fallback should do the same:

🔧 Proposed fix: update history after sending fallback
```diff
 if let Err(error) = self
     .response_tx
     .send(OutboundResponse::Text(summary.to_string()))
     .await
 {
     tracing::error!(
         %error,
         channel_id = %self.id,
         "failed to send retrigger truncation fallback reply"
     );
 }
+// Sync history with what user actually received
+let mut history = self.state.history.write().await;
+pop_retrigger_bridge_message(&mut history);
+history.push(rig::message::Message::Assistant {
+    id: None,
+    content: OneOrMany::one(rig::message::AssistantContent::text(
+        summary.to_string(),
+    )),
+});
 } else {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/channel.rs` around lines 1809 - 1830, The truncation fallback sends the full summary to the user but leaves the earlier truncated teaser stored in retrigger_reply_preserved, causing history mismatch; after sending the fallback (the block that calls self.state.conversation_logger.log_bot_message and self.response_tx.send), remove/replace the preserved teaser so history reflects the full summary—specifically, clear or overwrite retrigger_reply_preserved (or call the conversation logger’s method to delete/replace the last assistant message) so the preserved truncated reply is not left in state.
🧹 Nitpick comments (1)
src/agent/channel.rs (1)
2908-2910: Consider logging when falling through to LLM path.

When `direct_relay_text.trim().is_empty()`, execution silently falls through to the LLM relay path. Adding a debug log here would help diagnose cases where results exist but produce empty relay text:

📝 Suggested logging

```diff
 if !direct_relay_text.trim().is_empty() {
     tracing::info!(
         channel_id = %self.id,
         result_count,
         relay_len = direct_relay_text.len(),
         "relaying retrigger results directly without LLM"
     );
     // ... direct relay logic
+} else {
+    tracing::debug!(
+        channel_id = %self.id,
+        result_count,
+        "direct relay text empty, falling through to LLM relay"
+    );
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/channel.rs` around lines 2908 - 2910, When the branch that checks direct_relay_text.trim().is_empty() falls through to the LLM relay path there is no log entry; add a tracing::debug (or trace) call right after the if that triggers when direct_relay_text is empty to record that we're falling back to the LLM relay path, include channel_id = %self.id and any small context like the length or preview of direct_relay_text (e.g. len or truncated slice) and a short message "falling back to LLM relay" so it's easy to correlate with the existing tracing::info that logs when direct_relay_text is non-empty.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7b192630-ded7-46ef-a560-45bcf90fe9a4
📒 Files selected for processing (1)
src/agent/channel.rs
```rust
};

if last_assistant_len < 80 && summary.chars().count() > 200 {
    tracing::warn!(
```
In the truncation guardrail path, we send summary as a fallback, but we don't update self.state.history, so the in-memory transcript can stay as the teaser-only assistant message. Consider replacing the last assistant message (or injecting summary) when this triggers so subsequent turns/compaction match what the user actually saw.
```rust
});

self.retrigger_count += 1;
self.pending_retrigger = false;
```
Direct relay skips the retrigger LLM turn, so the StatusBlock::mark_relayed path in handle_message won't run. That means completed items can keep showing full result_summary in the status block and may get re-mentioned on later turns.
Consider marking the just-relayed process_ids as relayed here before clearing pending_results.
```rust
self.pending_retrigger = false;
{
    let process_ids: Vec<String> = self
        .pending_results
        .iter()
        .map(|result| result.process_id.clone())
        .collect();
    let mut status = self.state.status_block.write().await;
    status.mark_relayed(&process_ids);
}
self.retrigger_count += 1;
```
Problem
When a branch/worker completes and the retrigger relay LLM pass runs, the model sometimes generates only a teaser (e.g. `Hey — here's where we stand:`) instead of relaying the full result. This leaves users seeing truncated responses with no way to get the actual content.
Root Cause
The retrigger flow runs a second LLM turn to "summarize and relay" branch results. But the model often produces a short intro and stops, treating the relay as conversational rather than faithfully forwarding the output.
Relationship to #333
PR #333 (`fix(channel): stop re-summarising worker results that were already relayed`) addressed a related but different issue: re-iteration of already-relayed results on subsequent turns. It adds a `relayed` flag and filters the status block.
This PR addresses the initial relay truncation: the first time results are delivered, the LLM relay pass can still produce a teaser instead of the full output, even with #333's fixes in place. The two PRs are complementary.
Fix
1. Direct relay mode (primary fix)
Skip the second LLM pass entirely and send branch/worker results directly to the user. A single successful result is forwarded as-is; multiple results or failures are composed into a structured summary.
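The direct-relay composition can be sketched deterministically along these lines (the `PendingResult` struct and its fields are assumptions based on what is visible in the review comments, not the actual types in `channel.rs`):

```rust
// Hypothetical sketch of direct-relay composition: a single successful result
// is forwarded verbatim; multiple results (or any failure) get a structured
// per-update summary instead of a second LLM pass.
struct PendingResult {
    process_type: String,
    success: bool,
    result: String,
}

fn compose_direct_relay(results: &[PendingResult]) -> String {
    if results.len() == 1 && results[0].success {
        return results[0].result.clone();
    }
    results
        .iter()
        .enumerate()
        .map(|(idx, r)| {
            let status = if r.success { "ok" } else { "failed" };
            format!(
                "Update {} ({}, {}):\n{}",
                idx + 1,
                r.process_type,
                status,
                r.result
            )
        })
        .collect::<Vec<_>>()
        .join("\n\n")
}
```

Because this path is pure string assembly, the relayed text can never be shorter than the underlying results.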
2. Truncation guardrail (defense in depth)
If the LLM relay path is still reached (direct relay text was empty) and produces a suspiciously short reply (<80 chars) while the queued summary is substantial (>200 chars), the full summary is sent automatically as a fallback.
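The guardrail condition reduces to a small pure predicate over the two lengths, using the thresholds from the diff (the function name here is illustrative, not the one in `channel.rs`):

```rust
// Thresholds match the values in the PR diff (80 / 200).
const TRUNCATED_REPLY_THRESHOLD_CHARS: usize = 80;
const SUBSTANTIAL_SUMMARY_THRESHOLD_CHARS: usize = 200;

// A reply looks truncated when it is very short while the queued summary
// it was supposed to relay is substantial.
fn looks_truncated(last_assistant_reply: &str, summary: &str) -> bool {
    last_assistant_reply.chars().count() < TRUNCATED_REPLY_THRESHOLD_CHARS
        && summary.chars().count() > SUBSTANTIAL_SUMMARY_THRESHOLD_CHARS
}
```

A teaser like `Hey — here's where we stand:` (28 chars) against a 300-char summary trips the guardrail; a full-length reply does not.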
Trade-offs
This is an aggressive approach — it bypasses the LLM relay entirely, so results are forwarded raw without conversational framing. This guarantees completeness but loses the "natural language summary" behavior. A softer alternative would be to retry the LLM pass with a stricter prompt, but that adds latency and still risks truncation.
Testing
Tested with branch operations that produce multi-paragraph results. Before: users saw `Hey — here's where we stand:` and nothing else. After: full branch output is delivered immediately without a second LLM pass.