
Claude Code driver: use persistent sessions instead of one-shot subprocess per message #975

@aris-relay

Feature Request

The claude-code LLM driver currently spawns a new claude -p "prompt" --output-format json subprocess for every single message. This means:

  • Full CLI startup cost (~2-3s) on every message
  • No conversation history between messages (each is stateless)
  • No session persistence for agents
  • High latency for interactive use cases like Discord chat

Current Behavior

ClaudeCodeDriver::complete() in openfang-runtime/src/drivers/claude_code.rs calls:

Command::new(&self.cli_path)
    .arg("-p")
    .arg(&prompt)
    .arg("--output-format")
    .arg("json")

Every message spawns a new process, waits for it to finish, then parses the output.
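For reference, the per-message flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual driver code: `one_shot_args` and `run_one_shot` are hypothetical names, and the real `ClaudeCodeDriver::complete()` may build its command and handle errors differently. Only the `-p` and `--output-format json` flags come from the snippet above.

```rust
use std::io;
use std::process::Command;

/// Hypothetical helper: the argument list used for a single stateless call.
pub fn one_shot_args(prompt: &str) -> Vec<String> {
    ["-p", prompt, "--output-format", "json"]
        .iter()
        .map(|s| s.to_string())
        .collect()
}

/// Hypothetical helper: spawn the CLI, block until it exits, and return its
/// stdout. Every call pays the full process startup cost again.
pub fn run_one_shot(cli_path: &str, prompt: &str) -> io::Result<String> {
    let output = Command::new(cli_path)
        .args(one_shot_args(prompt))
        .output()?; // spawns the process and waits for it to finish
    if !output.status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("CLI exited with {}", output.status),
        ));
    }
    // The caller would parse this JSON text into a response.
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Because nothing is carried over between calls, any conversation history has to be packed into `prompt` each time.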

Proposed Behavior

The Claude Code CLI supports persistent sessions via --session-id <id> and can also run as an MCP server or in conversation mode. The driver could:

  1. Reuse session IDs — pass --session-id <agent-id> so conversation context persists across messages without re-sending the full history each time
  2. Keep a long-running subprocess — use claude --output-format stream-json in an interactive mode instead of spawning per-message
  3. Use the Claude Code MCP server: claude mcp serve exposes Claude's capabilities over MCP stdio, which could be more efficient than repeated subprocess spawns
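As a rough sketch of option 1, the driver could append the session flag when it builds the command. The helper names here (`session_args`, `session_command`) are hypothetical. The sketch assumes the CLI accepts `--session-id` on repeated invocations as described above, which should be checked against the CLI documentation: it may require a specific ID format (for example a UUID) or a different flag to continue an existing session.

```rust
use std::process::Command;

/// Hypothetical helper: same arguments as the one-shot call, plus a session
/// ID so the CLI can associate the message with earlier ones.
/// Assumption: `--session-id` behaves as the issue describes; the ID format
/// is not validated here.
pub fn session_args(session_id: &str, prompt: &str) -> Vec<String> {
    [
        "-p",
        prompt,
        "--output-format",
        "json",
        "--session-id",
        session_id,
    ]
    .iter()
    .map(|s| s.to_string())
    .collect()
}

/// Hypothetical helper: builds (but does not run) the command for one message.
pub fn session_command(cli_path: &str, session_id: &str, prompt: &str) -> Command {
    let mut cmd = Command::new(cli_path);
    cmd.args(session_args(session_id, prompt));
    cmd
}
```

This still spawns a process per message, so it addresses conversation history rather than startup cost. Only the newest message needs to be sent as the prompt.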

Any of these would significantly reduce latency for interactive channels (Discord, Telegram, etc.) where users expect near-instant responses.
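Option 2 could look roughly like the sketch below: one child process is spawned once and then reused for every message over piped stdin/stdout. `PersistentProcess` is a hypothetical type, and the framing (one line written per message, one line read per reply) is an assumption made for illustration. The CLI's actual streaming protocol is not specified in this issue and would need to be confirmed before adopting this.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::process::{Child, ChildStdin, ChildStdout, Command, Stdio};

/// Hypothetical wrapper around a long-running child process.
pub struct PersistentProcess {
    child: Child,
    stdin: ChildStdin,
    stdout: BufReader<ChildStdout>,
}

impl PersistentProcess {
    /// Spawn the child once, keeping its stdin and stdout pipes open.
    pub fn spawn(program: &str, args: &[&str]) -> io::Result<Self> {
        let mut child = Command::new(program)
            .args(args)
            .stdin(Stdio::piped())
            .stdout(Stdio::piped())
            .spawn()?;
        let stdin = child
            .stdin
            .take()
            .ok_or_else(|| io::Error::new(io::ErrorKind::Other, "child stdin not captured"))?;
        let stdout = child
            .stdout
            .take()
            .ok_or_else(|| io::Error::new(io::ErrorKind::Other, "child stdout not captured"))?;
        Ok(Self { child, stdin, stdout: BufReader::new(stdout) })
    }

    /// Write one message line and read one reply line.
    /// Assumption: line-delimited framing; a message that contains a newline
    /// would break it.
    pub fn send(&mut self, message: &str) -> io::Result<String> {
        self.stdin.write_all(message.as_bytes())?;
        self.stdin.write_all(b"\n")?;
        self.stdin.flush()?;
        let mut line = String::new();
        if self.stdout.read_line(&mut line)? == 0 {
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "child closed stdout",
            ));
        }
        Ok(line.trim_end().to_string())
    }
}

impl Drop for PersistentProcess {
    /// Stop the child when the wrapper goes away so no process is leaked.
    fn drop(&mut self) {
        let _ = self.child.kill();
        let _ = self.child.wait();
    }
}
```

With this shape the startup cost is paid once per agent rather than once per message. The driver would also need to handle restarts if the child exits unexpectedly.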

Impact

This affects all users of the claude-code provider, but is especially noticeable on channel integrations where back-and-forth conversation is the primary use case.
