ooda-engine

Self-improving AI engine that makes your codebase better, autonomously.

Point it at any code that produces scorable output — websites, emails, API responses, LLM prompts, data pipelines — and it will iteratively improve the source code through multi-agent critique, voting, and verified edits.

How it works

Each iteration runs the full OODA loop:

BUILD → OBSERVE → ORIENT → DECIDE → ACT → VERIFY
  1. Build — Run your build command to produce artifacts
  2. Observe — Score every artifact (your scoring function + frequency analysis)
  3. Orient — 3 AI critic agents independently analyze systemic patterns
  4. Decide — Strategist agent synthesizes all critiques, picks top 3 source code edits
  5. Act — Builder applies edits with safety checks (line-based, max 30 lines, backup)
  6. Verify — Rebuild everything, re-score. If score drops → automatic rollback

The loop stops when it hits a plateau (no improvement for N iterations) or reaches the max iteration count.
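The outer loop can be sketched roughly as follows. This is illustrative only — `runLoop` and `runIteration` are hypothetical stand-ins; the real orchestration lives in `ooda.js`:

```javascript
// Hypothetical sketch of the outer loop with plateau detection.
// runIteration is assumed to perform one full BUILD → ... → VERIFY pass
// and return the resulting average score.
function runLoop(runIteration, { maxIterations = 100, plateauLimit = 6 } = {}) {
  let best = -Infinity;
  let flat = 0;
  for (let i = 0; i < maxIterations; i++) {
    const score = runIteration(i);
    if (score > best) {
      best = score;
      flat = 0;                    // improvement resets the plateau counter
    } else if (++flat >= plateauLimit) {
      break;                       // N flat iterations in a row: stop early
    }
  }
  return best;
}
```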

What makes it different

  • It edits your actual source code, not just configs or parameters
  • Multi-agent critique — 3 independent critics prevent tunnel vision
  • Automatic rollback — regressions are caught and reverted immediately
  • Skill memory — learned patterns persist across runs via SQLite
  • Plateau detection — stops when it can't improve further (no wasted API calls)
  • Deterministic + AI scoring — fast deterministic checks catch concrete signals, AI catches subjective quality

Quick start

npm install ooda-engine
# or clone this repo
  1. Create ooda.config.js in your project root:
module.exports = {
  name: 'My Project',

  // Files the AI can read and modify
  engineFiles: {
    'src/templates.js': { path: 'src/templates.js', role: 'Page renderer' },
    'src/prompts.js':   { path: 'src/prompts.js',   role: 'LLM prompts' },
  },

  // Artifacts to build and score each iteration
  testArtifacts: [
    { id: 'test-1', input: 'hello world' },
    { id: 'test-2', input: 'complex query' },
  ],

  // Build function — produce output from your code
  buildCommand(artifact) {
    const { render } = require('./src/templates');
    return render(artifact.input);
  },

  // Score function — rate the output 1-10
  scoreArtifact(output, artifact) {
    let score = 5;
    if (output.length > 1000) score += 2;
    if (output.includes('error')) score -= 3;
    return { score: Math.max(1, Math.min(10, score)), signals: [], issues: [] };
  },
};
  2. Set your AI provider:
cp .env.example .env
# Edit .env — add GEMINI_API_KEY or OPENAI_API_KEY
  3. Run:
node ooda.js              # Full run (100 iterations)
node ooda.js --max=10     # Quick test
node ooda.js --dry-run    # See what it would change without modifying files
node ooda.js --status     # View progress from a previous run
node ooda.js --reset      # Start fresh (archives old state)

Use cases

| Domain | Build function | Score function |
| --- | --- | --- |
| Website templates | Render HTML from templates | Score content completeness, SEO, accessibility |
| LLM prompts | Run prompt, collect output | Score accuracy, relevance, format compliance |
| Email generators | Generate email HTML | Score deliverability signals, CTA quality |
| API formatters | Format sample responses | Score schema compliance, completeness |
| Code generators | Generate code from specs | Score compilation, test pass rate |
| Data pipelines | Process sample datasets | Score output accuracy, coverage |
| Doc generators | Generate docs from code | Score coverage, readability, correctness |
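For the first row, a deterministic scoring function might look like this. It is a hypothetical example for the "Website templates" domain — cheap string checks catch concrete signals, while the AI critics handle the subjective side:

```javascript
// Hypothetical scoreArtifact for rendered HTML (illustrative assumptions:
// a page should have a <title>, alt text on images, and non-trivial length).
function scoreArtifact(html, artifact) {
  let score = 5;
  const signals = [];
  const issues = [];
  if (/<title>[^<]+<\/title>/.test(html)) {
    score += 1;
    signals.push('has <title>');
  } else {
    issues.push('missing <title> tag');
  }
  if (/<img(?![^>]*\balt=)/i.test(html)) {
    score -= 1;                      // an <img> with no alt attribute
    issues.push('img without alt');
  }
  if (html.length > 1000) score += 1; // reward non-trivial pages
  return { score: Math.max(1, Math.min(10, score)), signals, issues };
}
```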

Architecture

ooda.js                  Main loop orchestrator
lib/
  ai.js                  Unified AI client (Gemini / OpenAI)
  db.js                  SQLite skill storage
  skill-memory.js        Learning agent — records patterns, injects into prompts
  reviewer.js            Multi-agent QA panel (usable standalone)
examples/
  ooda.config.js         Example configuration
  reviewer-example.js    Standalone reviewer usage

The 6 agents

| # | Agent | Role |
| --- | --- | --- |
| 1 | Observer | Scores every artifact using your scoring function |
| 2 | Critic A | Domain-specific code analysis (customizable) |
| 3 | Critic B | Prompt/generation quality analysis (customizable) |
| 4 | Critic C | Cross-artifact pattern analysis (customizable) |
| 5 | Strategist | Synthesizes all critiques, picks top 3 code edits |
| 6 | Builder | Applies edits with safety checks + rollback |

Safety guarantees

  • Max 30 lines per edit — prevents large, risky rewrites
  • Automatic rollback on build failure or score regression > threshold
  • File backups before every edit
  • Module compilation check after edits (verifies require() works)
  • Plateau detection — stops after N flat iterations (default: 6)
  • Dry run mode — analyze without modifying anything

Skill memory

The engine learns from its iterations. High-confidence patterns are stored in SQLite and injected into future LLM prompts as "LEARNED PREFERENCES":

const { applySkills, recordObservation } = require('./lib/skill-memory');

// Record what works
recordObservation({
  domain: 'templates',
  pattern: 'Short headlines (under 50 chars) score 20% higher',
  recommendation: { action: 'prefer short headlines' },
});

// Inject learned patterns into any prompt
const augmentedPrompt = applySkills(myPrompt, { domain: 'templates' });

Standalone reviewer

The multi-agent review panel can be used independently:

const { createReviewPanel } = require('./lib/reviewer');

const review = createReviewPanel([
  {
    name: 'Quality Checker',
    buildPrompt: (artifact) => 'You are a QA reviewer. Score 1-10. Return JSON: { "score": N, "issues": [] }',
  },
]);

const result = await review(myArtifact);
// { pass: true, scores: { 'Quality Checker': 8 }, feedback: [] }

Configuration reference

module.exports = {
  // Required
  name: 'string',                    // Project name (shown in logs/reports)
  engineFiles: { ... },              // Files agents can read and modify
  testArtifacts: [ ... ],            // Array of artifacts to build/score
  buildCommand: (artifact) => output, // Build function
  scoreArtifact: (output, artifact) => score, // Scoring function

  // Optional
  domainContext: 'string',           // Describes your domain for the strategist
  engineDir: 'path',                 // Root dir for engine files (default: cwd)
  stateDir: 'path',                  // Where state is saved (default: _ooda/)
  maxIterations: 100,                // Or set OODA_MAX_ITERATIONS env var
  plateauLimit: 6,                   // Stop after N flat iterations
  rateLimitMs: 3500,                 // Delay between LLM calls
  regressionThreshold: -0.3,         // Rollback threshold
  verifyModules: true,               // require() check after edits

  // Customize critic agents
  critics: {
    A: { name, files, systemPrompt, buildPrompt },
    B: { name, files, systemPrompt, buildPrompt },
    C: { name, systemPrompt, buildPrompt },
  },
};

Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| AI_PROVIDER | gemini | gemini or openai |
| GEMINI_API_KEY | (none) | Required if using Gemini |
| OPENAI_API_KEY | (none) | Required if using OpenAI |
| OODA_MAX_ITERATIONS | 100 | Max iterations per run |
| OODA_DB_PATH | ./skills.db | SQLite database path for skill memory |
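A .env file using these variables might look like this (values are placeholders, not real keys):

```shell
AI_PROVIDER=gemini
GEMINI_API_KEY=your-key-here   # or set OPENAI_API_KEY with AI_PROVIDER=openai
OODA_MAX_ITERATIONS=50
OODA_DB_PATH=./skills.db
```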

License

MIT
