
feat: GStack Learns — per-project self-learning infrastructure (v0.13.4.0)#622

Open: garrytan wants to merge 11 commits into main from garrytan/ce-features


@garrytan
Owner

Summary

Every session now makes the next one smarter. Per-project institutional knowledge that compounds across skills.

New: learnings persistence, /learn skill, confidence calibration, cross-project discovery, confidence decay, learnings count in preamble.

Infrastructure: 2 bin scripts, 2 resolver modules, 9 template integrations, 1 design doc.

Test Coverage

601 tests pass, 0 fail. 13 new unit tests for bin scripts.

Pre-Landing Review

CEO + Eng Review CLEARED. 2 Codex outside voice runs (4 findings accepted).

Test plan

  • bun test — 601 pass, 0 fail
  • Manual bin script verification
  • gen:skill-docs freshness check

🤖 Generated with Claude Code

garrytan and others added 11 commits March 28, 2026 22:52
…cture

Three new resolvers for the self-learning system:
- LEARNINGS_SEARCH: tells skills to load prior learnings before analysis
- LEARNINGS_LOG: tells skills to capture discoveries after completing work
- CONFIDENCE_CALIBRATION: adds 1-10 confidence scoring to all review findings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
gstack-learnings-log: validates JSON, auto-injects timestamp, appends to
~/.gstack/projects/$SLUG/learnings.jsonl. Append-only (no mutation).
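
The validate, stamp, and append steps described above can be sketched as follows. This is a minimal Python illustration, not the actual bin script: the function name `log_learning` and the timestamp field name `ts` are assumptions made for the example.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_learning(raw: str, path: Path) -> dict:
    """Validate one JSON object, stamp it if needed, append it as one JSONL line."""
    entry = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad input
    if not isinstance(entry, dict):
        raise ValueError("a learning must be a JSON object")
    # Hypothetical field name "ts"; a timestamp supplied by the caller is kept.
    entry.setdefault("ts", datetime.now(timezone.utc).isoformat())
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" only ever adds to the end of the file, so earlier lines are never rewritten.
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```

Validation happens before the file is opened, so a rejected input leaves the log untouched.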

gstack-learnings-search: reads/filters/dedupes learnings with confidence
decay (observed/inferred lose 1pt/30d), cross-project discovery, and
"latest winner" resolution per key+type.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
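
The search rules in this commit (decay of 1 point per 30 days for observed/inferred entries, no decay for user-stated ones, "latest winner" per key+type, skipping malformed lines) can be sketched as below. This is a hedged Python illustration, not the shipped script: the field names `key`, `type`, `confidence`, `source`, and `ts` are assumptions, and "latest" is taken to mean the last matching line in the append-only file.

```python
import json
from datetime import datetime, timezone

DECAYING_SOURCES = {"observed", "inferred"}  # user-stated entries keep their score


def effective_confidence(entry: dict, now: datetime) -> int:
    """Subtract 1 point per full 30 days of age, never going below 0."""
    confidence = int(entry.get("confidence", 0))
    ts = entry.get("ts")
    if entry.get("source") not in DECAYING_SOURCES or not ts:
        return confidence
    age_days = (now - datetime.fromisoformat(ts)).days
    return max(0, confidence - max(0, age_days) // 30)


def search(lines, now, type_filter=None, query=None, limit=None):
    """Dedupe, filter, decay, and rank learnings read from JSONL lines."""
    latest = {}
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed JSONL lines instead of failing
        if not isinstance(entry, dict):
            continue
        # Later lines overwrite earlier ones: "latest winner" per key+type.
        latest[(entry.get("key"), entry.get("type"))] = entry
    results = []
    for entry in latest.values():
        if type_filter and entry.get("type") != type_filter:
            continue
        if query and query.lower() not in json.dumps(entry).lower():
            continue
        results.append({**entry, "confidence": effective_confidence(entry, now)})
    results.sort(key=lambda e: e["confidence"], reverse=True)
    return results[:limit]  # limit=None returns everything
```

Decay is computed at read time rather than written back, which keeps the log file append-only.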
Every skill now prints "LEARNINGS: N entries loaded" during preamble,
making the compounding loop visible to the user.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
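
A minimal Python sketch of how such a preamble line could be produced. The function name `learnings_banner` is hypothetical, and counting raw non-empty lines (rather than deduplicated entries) is an assumption; the PR does not say which count the preamble uses.

```python
from pathlib import Path


def learnings_banner(path: Path) -> str:
    """Count non-empty lines in a learnings JSONL file and format the preamble line."""
    count = 0
    if path.exists():
        with path.open(encoding="utf-8") as f:
            count = sum(1 for line in f if line.strip())
    return f"LEARNINGS: {count} entries loaded"
```

A project with no learnings file yet simply reports zero entries.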
Add {{LEARNINGS_SEARCH}}, {{LEARNINGS_LOG}}, and {{CONFIDENCE_CALIBRATION}}
placeholders to review, ship, plan-eng-review, plan-ceo-review, office-hours,
investigate, retro, and cso templates. Regenerated all SKILL.md files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
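
Placeholder expansion of this kind can be sketched in a few lines. This is a Python illustration under stated assumptions, not the gen-skill-docs code: the `RESOLVERS` table, the `render` function, and the resolver texts are all hypothetical stand-ins for whatever the real resolver modules emit.

```python
import re

# Hypothetical resolver table: placeholder name -> function returning the text to inline.
RESOLVERS = {
    "LEARNINGS_SEARCH": lambda: "Before analysis, load prior learnings for this project.",
    "LEARNINGS_LOG": lambda: "After completing work, log any new discoveries.",
    "CONFIDENCE_CALIBRATION": lambda: "Score every finding from 1 to 10 for confidence.",
}

PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")


def render(template: str, resolvers=RESOLVERS) -> str:
    """Replace each {{NAME}} placeholder with the output of its resolver."""
    def substitute(match):
        name = match.group(1)
        if name not in resolvers:
            # Fail loudly so a typo in a template cannot ship as literal braces.
            raise KeyError(f"no resolver for placeholder {name}")
        return resolvers[name]()
    return PLACEHOLDER.sub(substitute, template)
```

Raising on an unknown placeholder means a regenerated SKILL.md cannot silently keep an unexpanded `{{...}}` marker.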
New skill for reviewing, searching, pruning, and exporting what gstack
has learned across sessions. Commands: /learn, /learn search, /learn prune,
/learn export, /learn stats, /learn add.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Covers: R1 GStack Learns (v0.14), R2 Review Army (v0.15), R3 Smart Ceremony
(v0.16), R4 /autoship (v0.17), R5 Studio (v0.18). Inspired by Compound
Engineering, adapted to GStack's architecture.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests gstack-learnings-log (valid/invalid JSON, timestamp injection,
append-only) and gstack-learnings-search (dedup, type/query/limit filters,
confidence decay, user-stated no-decay, malformed JSONL skip).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… free

Adds gen-skill-docs coverage for LEARNINGS_SEARCH, LEARNINGS_LOG, and
CONFIDENCE_CALIBRATION resolvers. Adds bin script edge cases: timestamp
preservation, special characters, files array, sort order, type grouping,
combined filtering, missing fields, confidence floor at 0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Main landed v0.13.4.0 (Sidebar Defense) while this branch also used
v0.13.4.0 (GStack Learns). Resolved by bumping this branch to v0.13.5.0
and keeping both entries in chronological order.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
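
The collision fix above amounts to incrementing the third version component until the number is free. A small Python sketch, assuming four-part `vA.B.C.D` versions; the helper name `bump_if_taken` is hypothetical and not part of the repository.

```python
def bump_if_taken(version: str, taken: set) -> str:
    """Increment the third component (resetting the fourth) until the version is unused."""
    parts = [int(p) for p in version.lstrip("v").split(".")]
    while "v" + ".".join(map(str, parts)) in taken:
        parts[2] += 1
        parts[3] = 0
    return "v" + ".".join(map(str, parts))
```

With `taken = {"v0.13.4.0"}`, calling `bump_if_taken("v0.13.4.0", taken)` returns `"v0.13.5.0"`, matching the resolution described in the commit.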
@github-actions

E2E Evals: ✅ PASS

56/56 tests passed | $5.85 total cost | 12 parallel runners

Suite            Result  Cost
e2e-browse       4/4     $0.16
e2e-deploy       6/6     $1.04
e2e-design       3/3     $0.58
e2e-plan         7/7     $1.12
e2e-qa-workflow  3/3     $1.04
e2e-review       6/6     $1.14
e2e-workflow     3/3     $0.29
llm-judge        24/24   $0.48

12x ubicloud-standard-2 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite
