feat: MCP Server Architecture for Checkpoint-Based Workflow Execution by nhorton · Pull Request #200 · Unsupervisedcom/deepwork

nhorton · 2026-02-04T21:16:55Z

Summary

This PR introduces a major architectural shift from skill-file-based workflow execution to a Model Context Protocol (MCP) server that guides agents through workflows via checkpoint calls with quality gate enforcement.

Key changes:

- New MCP Server (`deepwork serve`) with three tools: get_workflows, start_workflow, finished_step
- Quality Gates that evaluate step outputs against criteria using a Claude Code subprocess
- Nested Workflow Support with stack-based execution and abort_workflow capability
- Simplified Skill Generation: a single /deepwork entry point instead of per-step skills
- Rules System Removed: the entire rules subsystem (parser, queue, pattern matcher, hooks) is deleted

Why This Change?

The previous architecture relied heavily on skill files with embedded instructions and rules-based hooks. This had several limitations:

- Complex rules evaluation at every agent stop event
- Difficult to track workflow state across steps
- No structured quality enforcement
- Hard to resume or debug workflows

The MCP approach provides:

- Centralized state: session state is persisted and visible in .deepwork/tmp/
- Quality gates: automated validation before proceeding to the next step
- Structured checkpoints: clear handoff points between steps
- Resumability: sessions can be loaded and resumed
- Observability: all state changes are logged and inspectable
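To make the checkpoint idea concrete, here is a minimal in-memory sketch of how the three tools could interact. It is not the code in this PR: the `Session` class, the `WORKFLOWS` table, the status strings, and the `non_empty_gate` function are all hypothetical stand-ins, and the real quality gate calls a review agent rather than a local function.

```python
from dataclasses import dataclass, field

# Hypothetical workflow table; real workflows come from job definitions.
WORKFLOWS = {"demo": ["define", "implement", "learn"]}


@dataclass
class Session:
    workflow: str
    steps: list
    index: int = 0
    log: list = field(default_factory=list)  # every state change is recorded


def get_workflows():
    """List the workflows an agent may start."""
    return sorted(WORKFLOWS)


def start_workflow(name):
    """Open a session positioned at the first step."""
    session = Session(workflow=name, steps=list(WORKFLOWS[name]))
    session.log.append(f"started {name} at {session.steps[0]}")
    return session


def finished_step(session, outputs, gate):
    """Checkpoint: run the gate, then either hold the step or advance."""
    step = session.steps[session.index]
    passed, feedback = gate(step, outputs)
    if not passed:
        session.log.append(f"{step}: gate failed")
        return {"status": "needs_work", "step": step, "feedback": feedback}
    session.index += 1
    if session.index == len(session.steps):
        session.log.append(f"{step}: passed, workflow complete")
        return {"status": "workflow_complete"}
    next_step = session.steps[session.index]
    session.log.append(f"{step}: passed, next is {next_step}")
    return {"status": "next_step", "step": next_step}


def non_empty_gate(step, outputs):
    """Toy gate (assumption): pass only if every output has a non-empty path."""
    ok = bool(outputs) and all(outputs.values())
    return ok, ("" if ok else "every output needs a non-empty path")
```

The point of the sketch is that the agent cannot move past a step except through `finished_step`, so the gate and the log see every transition.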

Changes by Area

New MCP Module (`src/deepwork/mcp/`)

- server.py: FastMCP server definition
- tools.py: MCP tool implementations
- state.py: Workflow session state management
- schemas.py: Pydantic models for I/O
- quality_gate.py: Quality gate with review agent

New CLI Command

deepwork serve - Starts MCP server (stdio or SSE transport)

Updated `deepwork_jobs` Standard Job

- New steps: iterate, errata, test, fix_jobs, fix_settings
- Streamlined define, implement, learn steps

Removed Components

- Entire rules system (rules_parser.py, rules_queue.py, pattern_matcher.py, rules_check.py)
- Command executor (command_executor.py)
- deepwork_rules standard job
- Per-step skill templates
- Many hook scripts
- commit and manual_tests jobs

Documentation

- New doc/mcp_interface.md: MCP tool reference
- New doc/reference/calling_claude_in_print_mode.md: Claude CLI subprocess guide
- Updated doc/architecture.md with Part 4: MCP Server Architecture
- Updated README.md to remove rules references

Test plan

- Run `deepwork install --platform claude` in a test project
- Verify the MCP server starts with `deepwork serve`
- Test workflow execution via the /deepwork skill
- Verify quality gate evaluation works
- Run the existing test suite: `uv run pytest`

🤖 Generated with Claude Code

- Add configurable quality_gate settings to config.yml (agent_review_command, default_timeout, default_max_attempts)
- Update installer to create quality_gate config section with defaults
- Refactor QualityGate to separate system instructions from user payload
- Use -s flag to pass instructions as system prompt to review agent
- Change file separator format to 20 dashes for clearer delineation
- Remove step_instructions from QualityGate interface (not useful for review)
- Add quality_review_override_reason to finished_step to skip quality gate
- Add JSON schema validation for quality gate responses
- Add comprehensive integration tests with mock review agent subprocess
- Remove block_bash_with_instructions hook (commit skill not available)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
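As a rough illustration of the quality gate changes in this commit, the sketch below models the three config fields, a user payload that delimits files with 20 dashes, and the override check. Only the field names and the 20-dash separator come from the commit message. The default values, the header layout around each path, and the helper names `build_user_payload` and `gate_is_skipped` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

SEPARATOR = "-" * 20  # "20 dashes" per the commit; the layout around it is assumed


@dataclass
class QualityGateConfig:
    # Field names are from the commit message; these defaults are placeholders.
    agent_review_command: str = "review-agent"
    default_timeout: int = 120
    default_max_attempts: int = 3


def build_user_payload(files):
    """Join file contents into one payload, one delimited section per file."""
    sections = []
    for path, content in files.items():
        sections.append(f"{SEPARATOR} {path} {SEPARATOR}\n{content}")
    return "\n\n".join(sections)


def gate_is_skipped(quality_review_override_reason: Optional[str]) -> bool:
    """Assumed rule: a non-blank override reason skips the review."""
    return bool(quality_review_override_reason
                and quality_review_override_reason.strip())
```

Keeping the system instructions out of this payload means the review agent receives only the files under review as user content, which matches the separation the commit describes.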

- Update e2e tests for Claude Code integration
- Add quality_criteria to fruits job fixture
- Fix test assertions for updated install flow
- Minor sync.py adjustments

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The rules system was removed in commit 6b3e1a2. This cleans up stale documentation references to rules_check in hook-related code. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- StateManager now uses a session stack instead of single active session
- Starting a workflow while one is active pushes onto the stack
- Completing a workflow pops from stack and resumes parent
- Added abort_workflow tool with explanation parameter
- All tool responses include stack field [{workflow, step}, ...]
- Added logging to all MCP tool calls with stack info
- Updated server instructions to document nesting and abort

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
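The stack behavior described in this commit can be sketched in a few lines. This is an illustrative model rather than the contents of state.py: the `Frame` class, the method signatures, and the shape of the abort response are assumptions. Only the push/pop semantics and the `[{workflow, step}, ...]` stack field are taken from the commit message.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    workflow: str
    step: str


@dataclass
class StateManager:
    _stack: list = field(default_factory=list)

    def stack(self):
        """The stack field returned with every tool response."""
        return [{"workflow": f.workflow, "step": f.step} for f in self._stack]

    def start_workflow(self, workflow, first_step):
        """Starting while another workflow is active nests on top of it."""
        self._stack.append(Frame(workflow, first_step))
        return self.stack()

    def complete_workflow(self):
        """Pop the finished workflow; the parent, if any, is active again."""
        if not self._stack:
            raise RuntimeError("no active workflow")
        self._stack.pop()
        return self.stack()

    def abort_workflow(self, explanation):
        """Drop the active workflow and record why (response shape assumed)."""
        if not self._stack:
            raise RuntimeError("no active workflow")
        aborted = self._stack.pop()
        return {
            "aborted": aborted.workflow,
            "explanation": explanation,
            "stack": self.stack(),
        }
```

After an abort or a completion, the last entry of the returned stack tells the agent which parent workflow and step it is resuming.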

- Add `from None` to raise in except clause (B904)
- Remove unused variables in tests (F841)
- Rename unused loop variable to underscore prefix (B007)
- Apply ruff formatting to 14 files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Replace flake-utils with uv2nix/pyproject-nix for proper Python dependency management in Nix. This provides hermetic builds directly from uv.lock and supports editable installs for development.

Key changes:
- Use uv2nix to generate Python package set from uv.lock
- Add pyproject-build-systems for build dependency resolution
- Add editables to build-system requires (needed by hatchling for editable wheel builds)
- Remove .venv management from shell hook (Nix handles it now)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Fix quality_gate.py to handle Claude CLI --output-format json wrapper objects by extracting the 'result' field before parsing
- Add tests for wrapper object handling with strong comments explaining the mock design
- Remove deprecated 'exposed' field from learn step in deepwork_jobs
- Add 'learn' workflow to make orphaned step accessible via MCP
- Add 'update' workflow to update job for MCP compatibility
- Migrate stop_hooks to quality_criteria in update job
- Clean up settings.json by removing obsolete Skill permissions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
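The wrapper handling in the first bullet can be pictured as follows. This is a sketch under assumptions, not the code in quality_gate.py: the function name `parse_review_response` and the `passed`/`feedback` fields in the usage are hypothetical. What the commit does state is that the CLI's JSON output wraps the agent's reply in an object and that the 'result' field must be extracted before parsing.

```python
import json


def parse_review_response(stdout: str) -> dict:
    """Return the review verdict, unwrapping a CLI wrapper object if present."""
    data = json.loads(stdout)
    # With JSON output the reply arrives inside a wrapper; the payload is
    # under "result", typically as a JSON-encoded string.
    if isinstance(data, dict) and "result" in data:
        inner = data["result"]
        data = json.loads(inner) if isinstance(inner, str) else inner
    if not isinstance(data, dict):
        raise ValueError("review response must be a JSON object")
    return data
```

For example, a wrapped reply and a bare reply both parse to the same kind of object:

```python
wrapped = json.dumps({"type": "result", "result": json.dumps({"passed": True})})
parse_review_response(wrapped)              # {'passed': True}
parse_review_response('{"passed": false}')  # {'passed': False}
```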

Document the major architectural changes including:
- New MCP server with checkpoint-based workflow execution
- Removal of the rules system
- Simplified skill generation
- New deepwork_jobs steps

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Mark 0.7.0 as alpha prerelease so that `uv add deepwork` continues to install the stable 0.5.1 by default, requiring explicit version specification for the new alpha. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Doc specs were never enforced programmatically — the infrastructure to parse them exists but was never wired into quality gates. Remove all doc spec guidance from job instructions to avoid misleading users into creating artifacts that have no effect. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

step_expected_outputs is now an array of ExpectedOutput objects (name, type, description, syntax_for_finished_step_tool) instead of a plain list of names. This tells agents exactly what format to use when calling finished_step — "filepath" for file outputs and "array of filepaths for all individual files" for files outputs — eliminating the string-vs-list type mismatch errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
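A small sketch of what such typed outputs might look like. The four field names and the two syntax strings are quoted from the commit message. The `file`/`files` type values are inferred from its wording, and the `expected_output` and `validate_outputs` helpers are hypothetical, shown only to illustrate how the string-vs-list mismatch can be caught.

```python
from dataclasses import dataclass

# Syntax hints quoted from the commit message, keyed by output type.
SYNTAX = {
    "file": "filepath",
    "files": "array of filepaths for all individual files",
}


@dataclass
class ExpectedOutput:
    name: str
    type: str
    description: str
    syntax_for_finished_step_tool: str


def expected_output(name, type_, description):
    """Build an ExpectedOutput with the syntax hint that matches its type."""
    return ExpectedOutput(name, type_, description, SYNTAX[type_])


def validate_outputs(expected, outputs):
    """Hypothetical check: a 'file' must be a str, 'files' a list of str."""
    errors = []
    for spec in expected:
        value = outputs.get(spec.name)
        if spec.type == "file":
            ok = isinstance(value, str)
        else:
            ok = isinstance(value, list) and all(isinstance(v, str) for v in value)
        if not ok:
            errors.append(f"{spec.name}: expected {spec.syntax_for_finished_step_tool}")
    return errors
```

Because each entry carries its own syntax hint, an agent reading step_expected_outputs does not have to infer whether a single path or a list is wanted.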

nhorton and others added 25 commits February 3, 2026 12:14

Removed rules

6b3e1a2

Port theoretically done

26a9911

mcp loads now

9b633b0

chore: Update tests and sync for MCP variant

cd2ae63

Cleaned up MCP rules

a3fae18

Remove old jobs

c3754f6

chore: Remove dead rules_check references from docstrings

fd0d348

async

443e13e

repair added but not run

0b3d666

cleaned up

f3af9d6

Version bump

c5c9f97

Merge branch 'main' into mcp-variant

9e63119

Fix ruff lint errors and apply formatting

88477a4

cleanups

897535c

MCP command updated

18043b3

add_job improved

fa40407

tighter instructions

3e21805

stop backing up rules

e122265

make_new_job.sh preserved, parallel execution, no dupe quality criteria

0000c17

formatting

d570baf

nhorton changed the title ~~Mcp variant~~ feat: MCP Server Architecture for Checkpoint-Based Workflow Execution Feb 5, 2026

remove update job

b561e2a

nhorton temporarily deployed to pypi February 5, 2026 18:40 — with GitHub Actions Inactive

Fix release version to prerelease (0.7.0a1)

48e23fe

nhorton temporarily deployed to pypi February 5, 2026 20:05 — with GitHub Actions Inactive

nhorton and others added 4 commits February 5, 2026 13:41

nix fixed

960acaa

remove repair summary

9d074ee

make mcp tolerant to name errors in workflow name

8471048

nhorton force-pushed the mcp-variant branch from 8c8c1f5 to 8471048 Compare February 5, 2026 23:02

Bump version to 0.7.0a2

2b8e85f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

nhorton deployed to pypi February 5, 2026 23:11 — with GitHub Actions Active

nhorton and others added 6 commits February 5, 2026 16:28

refactor the quality gate

089438e

Merge branch 'main' into mcp-variant

2ce38fb

Typed file outputs

b53519b

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge branch 'mcp-variant' of https://github.com/Unsupervisedcom/deep…

cff723f

…work into mcp-variant

ready to test

b96d22a

nhorton wants to merge 38 commits into main from mcp-variant

2 participants
