
fix(channel): direct relay retrigger results to prevent truncated replies#367

Open
lutr0 wants to merge 2 commits into spacedriveapp:main from lutr0:fix/retrigger-truncated-replies

Conversation


@lutr0 lutr0 commented Mar 8, 2026

Problem

When a branch/worker completes and the retrigger relay LLM pass runs, the model sometimes generates only a teaser (e.g. "Hey — here's where we stand:") instead of relaying the full result. This leaves users seeing truncated responses with no way to get the actual content.

Root Cause

The retrigger flow runs a second LLM turn to "summarize and relay" branch results. But the model often produces a short intro and stops, treating the relay as conversational rather than faithfully forwarding the output.

Relationship to #333

PR #333 (fix(channel): stop re-summarising worker results that were already relayed) addressed a related but different issue: re-iteration of already-relayed results on subsequent turns. It adds a relayed flag and filters the status block.

This PR addresses the initial relay truncation — the first time results are delivered, the LLM relay pass can still produce a teaser instead of the full output, even with #333's fixes in place. The two PRs are complementary.

Fix

1. Direct relay mode (primary fix)

Skip the second LLM pass entirely and send branch/worker results directly to the user:

  • Single successful result → raw output sent as-is
  • Multiple results or failures → structured summary composed automatically
  • Result injected into conversation history for continuity
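The composition rules above can be sketched as a small pure function. This is an illustrative sketch only: `PendingResult`, its fields, and `compose_direct_relay` are hypothetical stand-ins for the real types in `src/agent/channel.rs`, and the multi-result header format is borrowed from the review discussion below.

```rust
// Hypothetical sketch of the direct-relay composition; names are
// illustrative, not the actual types in src/agent/channel.rs.
pub struct PendingResult {
    pub process_type: String,
    pub success: bool,
    pub result: String,
}

/// Build the text sent straight to the user, skipping the second LLM pass.
pub fn compose_direct_relay(results: &[PendingResult]) -> String {
    match results {
        // Single successful result: forward the raw output as-is.
        [only] if only.success => only.result.clone(),
        // Multiple results or any failure: compose a structured summary.
        _ => results
            .iter()
            .enumerate()
            .map(|(idx, r)| {
                let status = if r.success { "ok" } else { "failed" };
                format!(
                    "Update {} ({}, {}):\n{}",
                    idx + 1,
                    r.process_type,
                    status,
                    r.result
                )
            })
            .collect::<Vec<_>>()
            .join("\n\n"),
    }
}
```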

2. Truncation guardrail (defense in depth)

If the LLM relay path is still reached (direct relay text was empty):

  • Detect suspiciously short replies (<80 chars) when the queued summary is substantial (>200 chars)
  • Automatically send the full summary as a fallback
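The guardrail amounts to a two-threshold check. A minimal sketch, assuming a free function for clarity (the function name is hypothetical; the 80/200 character thresholds are the ones used in this PR):

```rust
/// Heuristic from the PR: a very short assistant reply paired with a
/// substantial queued summary suggests the relay pass emitted only a teaser.
/// Thresholds are character counts, not bytes.
pub fn looks_truncated(last_assistant_reply: &str, queued_summary: &str) -> bool {
    last_assistant_reply.chars().count() < 80 && queued_summary.chars().count() > 200
}
```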

Trade-offs

This is an aggressive approach — it bypasses the LLM relay entirely, so results are forwarded raw without conversational framing. This guarantees completeness but loses the "natural language summary" behavior. A softer alternative would be to retry the LLM pass with a stricter prompt, but that adds latency and still risks truncation.

Testing

Tested with branch operations that produce multi-paragraph results. Before: users saw "Hey — here's where we stand:" and nothing else. After: full branch output is delivered immediately without a second LLM pass.

fix(channel): direct relay retrigger results to prevent truncated replies

When a branch/worker completes and the retrigger relay LLM pass runs,
the model sometimes generates only a teaser (e.g. "Hey — here's where
we stand:") instead of relaying the full result. This leaves users
seeing truncated responses.

Two fixes:

1. Direct relay mode: skip the second LLM pass entirely and send
   branch/worker results directly to the user. For single successful
   results, the raw output is sent as-is. For multiple results or
   failures, a structured summary is composed. This eliminates the
   truncation problem at its source.

2. Truncation guardrail (defense in depth): if the LLM relay path is
   still reached and produces a suspiciously short reply (<80 chars)
   while the queued summary is substantial (>200 chars), automatically
   send the full summary as a fallback.

coderabbitai bot commented Mar 8, 2026

Walkthrough

Adds deterministic direct-relay branches for retrigger results: the LLM turn is short-circuited so pending retrigger summaries are emitted directly (with logging and error handling), and a guardrail falls back to direct relay when the summary length versus the last assistant message crosses the configured thresholds.

Changes

Cohort / File(s) Summary
Retrigger relay logic
src/agent/channel.rs
Adds direct-relay fallback in retrigger handling when last assistant message <80 chars and summary >200 chars: logs warning, emits summary as bot message, attempts direct outbound send, logs send failures. Adds direct-relay branch in flush_pending_retrigger to build and emit direct_relay_text from pending_results, append to history, reset retrigger state, and short-circuit LLM turn. No public API changes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check (Passed): The title clearly and concisely describes the main change: adding direct relay functionality for retrigger results to prevent truncated replies, which is the primary fix in the changeset.
  • Description check (Passed): The description comprehensively explains the problem, root cause, relationship to prior work, implemented fix with two components, trade-offs, and testing approach — all directly relevant to the changeset.
  • Docstring coverage (Passed): Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
src/agent/channel.rs (3)

1721-1721: Consider extracting truncation thresholds as named constants.

The magic numbers 80 and 200 encode important heuristics but lack documentation. Named constants would clarify intent and make future tuning easier.

+const TRUNCATED_REPLY_THRESHOLD_CHARS: usize = 80;
+const SUBSTANTIAL_SUMMARY_THRESHOLD_CHARS: usize = 200;
+
 // In the is_retrigger block:
-if last_assistant_len < 80 && summary.chars().count() > 200 {
+if last_assistant_len < TRUNCATED_REPLY_THRESHOLD_CHARS 
+    && summary.chars().count() > SUBSTANTIAL_SUMMARY_THRESHOLD_CHARS {

2754-2773: Direct relay sends raw worker output for single successful results.

For a single successful result, the raw r.result is sent directly without any framing. This could include very long or unformatted output that wasn't intended to be user-facing.

Consider whether single-result output should also get minimal framing (e.g., a brief header) to maintain consistency with the multi-result format, or if the raw relay is intentional for simplicity.
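One hedged way to apply that framing, reusing the same header format the multi-result branch already emits (`frame_result` is a hypothetical helper, not code from this PR):

```rust
/// Hypothetical helper: wrap a single result in the same minimal header the
/// multi-result branch uses, so single and multiple results stay consistent.
pub fn frame_result(idx: usize, process_type: &str, status: &str, body: &str) -> String {
    format!("Update {} ({}, {}):\n{}", idx + 1, process_type, status, body)
}
```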


2748-2817: Direct relay effectively deprecates the LLM-based retrigger path.

The direct relay path will execute whenever pending_results is non-empty (which is guaranteed by the check at line 2736). The only way to reach the LLM-based retrigger code starting at line 2819 is if all pending results have empty result strings — an unlikely edge case.

This is a significant behavior change: users will receive raw worker/branch output instead of LLM-synthesized summaries. This aligns with the PR objective of avoiding truncated teaser replies, but consider:

  1. Adding a comment explaining when the LLM path would be reached
  2. For multiple results, users now receive all outputs concatenated rather than a summarized relay

The history handling is correct — pop_retrigger_bridge_message is called before pushing the new assistant message, maintaining the invariant against consecutive assistant messages.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: eefbeddf-1193-4758-a171-90e9f4ba29a0

📥 Commits

Reviewing files that changed from the base of the PR and between 203c9d9 and e1d42f3.

📒 Files selected for processing (1)
  • src/agent/channel.rs

Comment on lines +1721 to +1748
if last_assistant_len < 80 && summary.chars().count() > 200 {
    tracing::warn!(
        channel_id = %self.id,
        last_assistant_len,
        summary_len = summary.chars().count(),
        "retrigger reply looks truncated; sending summary fallback"
    );
    self.state.conversation_logger.log_bot_message(
        &self.state.channel_id,
        summary,
    );
    if let Err(error) = self
        .response_tx
        .send(OutboundResponse::Text(summary.to_string()))
        .await
    {
        tracing::error!(
            %error,
            channel_id = %self.id,
            "failed to send retrigger truncation fallback reply"
        );
    }
} else {
    tracing::debug!(
        channel_id = %self.id,
        "skipping retrigger summary injection; relay reply already preserved"
    );
}

⚠️ Potential issue | 🟠 Major

Truncation fallback sends message to user without updating history.

When the truncation guardrail triggers (lines 1728-1742), the full summary is sent to the user, but history still contains only the truncated assistant reply. This creates a mismatch: the user sees the full summary, but subsequent LLM turns will only see the truncated message in history.

The else branch at lines 1749-1776 correctly handles history injection. The fallback path should do the same to maintain consistency.

🔧 Proposed fix: update history after sending fallback
                 if last_assistant_len < 80 && summary.chars().count() > 200 {
                     tracing::warn!(
                         channel_id = %self.id,
                         last_assistant_len,
                         summary_len = summary.chars().count(),
                         "retrigger reply looks truncated; sending summary fallback"
                     );
                     self.state.conversation_logger.log_bot_message(
                         &self.state.channel_id,
                         summary,
                     );
                     if let Err(error) = self
                         .response_tx
                         .send(OutboundResponse::Text(summary.to_string()))
                         .await
                     {
                         tracing::error!(
                             %error,
                             channel_id = %self.id,
                             "failed to send retrigger truncation fallback reply"
                         );
                     }
+                    // Update history so the LLM sees what the user actually received
+                    let mut history = self.state.history.write().await;
+                    pop_retrigger_bridge_message(&mut history);
+                    history.push(rig::message::Message::Assistant {
+                        id: None,
+                        content: OneOrMany::one(rig::message::AssistantContent::text(
+                            summary.to_string(),
+                        )),
+                    });
                 } else {

@jamiepine
Member

This was by design, and I personally have not had much issue with it; however, I can see why you might want this. Could we make this a setting that is off by default? Ideally per agent in config → Channel Behavior.
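A minimal sketch of such an opt-in toggle, assuming a per-agent Channel Behavior section; the struct and field names here are hypothetical, not the project's actual config schema:

```rust
// Hypothetical per-agent Channel Behavior config; names are illustrative.
#[derive(Debug, Clone)]
pub struct ChannelBehavior {
    /// When false (the default), keep the existing LLM relay pass;
    /// when true, forward retrigger results directly without an LLM turn.
    pub direct_relay: bool,
}

impl Default for ChannelBehavior {
    fn default() -> Self {
        // Off by default, preserving current behavior as requested.
        Self { direct_relay: false }
    }
}
```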

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
src/agent/channel.rs (1)

1809-1830: ⚠️ Potential issue | 🟠 Major

Truncation fallback still creates history mismatch.

When the truncation guardrail triggers (lines 1809-1830), the full summary is sent to the user, but the conversation history retains the truncated assistant reply that was preserved via retrigger_reply_preserved. This means subsequent LLM turns will see the short teaser in history while the user saw the full summary.

The else branch at lines 1837-1863 correctly injects content into history. The truncation fallback should do the same:

🔧 Proposed fix: update history after sending fallback
                     if let Err(error) = self
                         .response_tx
                         .send(OutboundResponse::Text(summary.to_string()))
                         .await
                     {
                         tracing::error!(
                             %error,
                             channel_id = %self.id,
                             "failed to send retrigger truncation fallback reply"
                         );
                     }
+                    // Sync history with what user actually received
+                    let mut history = self.state.history.write().await;
+                    pop_retrigger_bridge_message(&mut history);
+                    history.push(rig::message::Message::Assistant {
+                        id: None,
+                        content: OneOrMany::one(rig::message::AssistantContent::text(
+                            summary.to_string(),
+                        )),
+                    });
                 } else {
🧹 Nitpick comments (1)
src/agent/channel.rs (1)

2908-2910: Consider logging when falling through to LLM path.

When direct_relay_text.trim().is_empty(), execution silently falls through to the LLM relay path. Adding a debug log here would help diagnose cases where results exist but produce empty relay text:

📝 Suggested logging
         if !direct_relay_text.trim().is_empty() {
             tracing::info!(
                 channel_id = %self.id,
                 result_count,
                 relay_len = direct_relay_text.len(),
                 "relaying retrigger results directly without LLM"
             );
             // ... direct relay logic
+        } else {
+            tracing::debug!(
+                channel_id = %self.id,
+                result_count,
+                "direct relay text empty, falling through to LLM relay"
+            );
         }

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7b192630-ded7-46ef-a560-45bcf90fe9a4

📥 Commits

Reviewing files that changed from the base of the PR and between e1d42f3 and fdfd814.

📒 Files selected for processing (1)
  • src/agent/channel.rs

};

if last_assistant_len < 80 && summary.chars().count() > 200 {
tracing::warn!(
In the truncation guardrail path, we send summary as a fallback, but we don't update self.state.history, so the in-memory transcript can stay as the teaser-only assistant message. Consider replacing the last assistant message (or injecting summary) when this triggers so subsequent turns/compaction match what the user actually saw.

});

self.retrigger_count += 1;
self.pending_retrigger = false;

Direct relay skips the retrigger LLM turn, so the StatusBlock::mark_relayed path in handle_message won't run. That means completed items can keep showing full result_summary in the status block and may get re-mentioned on later turns.

Consider marking the just-relayed process_ids as relayed here before clearing pending_results.

Suggested change
self.pending_retrigger = false;
{
    let process_ids: Vec<String> = self
        .pending_results
        .iter()
        .map(|result| result.process_id.clone())
        .collect();
    let mut status = self.state.status_block.write().await;
    status.mark_relayed(&process_ids);
}
self.retrigger_count += 1;
