Claude Code driver: use persistent sessions instead of one-shot subprocess per message #975
Description
Feature Request
The claude-code LLM driver currently spawns a new claude -p "prompt" --output-format json subprocess for every single message. This means:
- Full CLI startup cost (~2-3s) on every message
- No conversation history between messages (each is stateless)
- No session persistence for agents
- High latency for interactive use cases like Discord chat
Current Behavior
ClaudeCodeDriver::complete() in openfang-runtime/src/drivers/claude_code.rs calls:
Command::new(&self.cli_path)
.arg("-p")
.arg(&prompt)
.arg("--output-format")
.arg("json")
Every message spawns a new process, waits for it to finish, then parses the output.
Proposed Behavior
The Claude Code CLI supports persistent sessions via --session-id <id> and can also run as an MCP server or in conversation mode. The driver could:
- Reuse session IDs: pass --session-id <agent-id> so conversation context persists across messages without re-sending the full history each time
- Keep a long-running subprocess: use claude --output-format stream-json in an interactive mode instead of spawning per-message
- Use the Claude Code MCP server: claude mcp serve exposes Claude's capabilities over MCP stdio, which could be more efficient than repeated subprocess spawns
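As a rough illustration of the first option, the existing argument chain could be extended with a session ID. This is only a sketch: `build_session_command` is a hypothetical helper, not part of the current driver, and the exact ID format the CLI accepts (for example, whether it must be a UUID) is an assumption that should be checked against the CLI documentation.

```rust
use std::process::Command;

/// Hypothetical helper: builds (but does not run) a `claude` invocation
/// that carries a session ID alongside the flags the driver already passes.
/// Assumption: `session_id` is a stable per-agent identifier in whatever
/// format the CLI accepts.
fn build_session_command(cli_path: &str, prompt: &str, session_id: &str) -> Command {
    let mut cmd = Command::new(cli_path);
    cmd.arg("-p")
        .arg(prompt)
        .arg("--output-format")
        .arg("json")
        // The only change relative to the current call chain.
        .arg("--session-id")
        .arg(session_id);
    cmd
}
```

The caller would still spawn one process per message, so this addresses the missing conversation history but not the CLI startup cost.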
Any of these would significantly reduce latency for interactive channels (Discord, Telegram, etc.) where users expect near-instant responses.
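For the long-running subprocess option, the general shape might look like the sketch below: spawn the child once with piped stdin/stdout and reuse it for every message. `PersistentProcess` and `send_line` are hypothetical names, and the one-line-in, one-line-out framing is an assumption made for illustration; the real stream-json protocol would need its own request/response framing and end-of-turn detection.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::process::{Child, ChildStdin, ChildStdout, Command, Stdio};

/// Hypothetical wrapper around a child process that stays alive across messages.
struct PersistentProcess {
    child: Child,
    stdin: ChildStdin,
    stdout: BufReader<ChildStdout>,
}

impl PersistentProcess {
    /// Spawns the command once, with stdin and stdout piped to this process.
    fn spawn(mut cmd: Command) -> io::Result<Self> {
        let mut child = cmd.stdin(Stdio::piped()).stdout(Stdio::piped()).spawn()?;
        let stdin = child.stdin.take().expect("stdin was piped");
        let stdout = BufReader::new(child.stdout.take().expect("stdout was piped"));
        Ok(Self { child, stdin, stdout })
    }

    /// Writes one line to the child and reads one line back.
    /// Assumption: the child answers each input line with exactly one output line.
    fn send_line(&mut self, line: &str) -> io::Result<String> {
        writeln!(self.stdin, "{line}")?;
        self.stdin.flush()?;
        let mut reply = String::new();
        self.stdout.read_line(&mut reply)?;
        Ok(reply.trim_end().to_string())
    }
}

impl Drop for PersistentProcess {
    /// Terminates and reaps the child so no zombie process is left behind.
    fn drop(&mut self) {
        let _ = self.child.kill();
        let _ = self.child.wait();
    }
}
```

With this shape the startup cost is paid once per agent rather than once per message; the driver would hold one `PersistentProcess` per agent and restart it if the child exits.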
Impact
This affects all users of the claude-code provider, but is especially noticeable on channel integrations where back-and-forth conversation is the primary use case.