feat(indexer): baseline + overlay incremental indexing architecture #213

Merged: jafreck merged 2 commits into main from feat/incremental-index on Mar 15, 2026

@jafreck jafreck commented Mar 15, 2026

Summary

Add incremental indexing to Lore using a baseline + overlay architecture so the index stays accurate during active editing — including multi-agent concurrent-writer scenarios — without blocking on a full SCIP rebuild after every change.

Closes #212

Design document: docs/incremental-index-design.md

Schema Changes

  • Add layer (baseline|overlay) and generation columns to all data tables (files, symbols, symbol_refs, type_refs, symbol_relationships, file_imports, annotations, symbol_metrics, external_deps)
  • Add dirty_files table tracking files with active overlay data
  • Add reverse_deps table for impact-set computation ("file X is depended on by files Y, Z")
  • Add effective_* SQL views that merge baseline + overlay layers, preferring overlay for dirty files
  • New lore_meta keys: generation, generation_pending, baseline_head_sha, overlay_head_sha
  • Helper functions: getGeneration(), incrementGeneration()
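
The merge rule behind the effective_* views can be illustrated in memory. This is a minimal sketch, not the SQL from this PR: the `Row` shape and the `effectiveRows` function are hypothetical, and the only rule encoded is the one stated above (overlay rows win for files listed in dirty_files, baseline rows are used for every other file).

```typescript
// Hypothetical row shape; the real tables carry many more columns.
type Layer = 'baseline' | 'overlay';

interface Row {
  filePath: string;
  name: string;
  layer: Layer;
  generation: number;
}

// Sketch of the merge an effective_* view performs: for files listed in
// dirtyFiles keep only overlay rows; for all other files keep only baseline rows.
function effectiveRows(rows: Row[], dirtyFiles: Set<string>): Row[] {
  return rows.filter((row) =>
    dirtyFiles.has(row.filePath) ? row.layer === 'overlay' : row.layer === 'baseline',
  );
}

const rows: Row[] = [
  { filePath: 'a.ts', name: 'oldFn', layer: 'baseline', generation: 1 },
  { filePath: 'a.ts', name: 'newFn', layer: 'overlay', generation: 1 },
  { filePath: 'b.ts', name: 'helper', layer: 'baseline', generation: 1 },
];

// a.ts is dirty, so its overlay row replaces its baseline row; b.ts is clean.
const effective = effectiveRows(rows, new Set(['a.ts']));
const effectiveNames = effective.map((r) => `${r.filePath}:${r.name}`);
```

Because the baseline row for a.ts is filtered out rather than deleted, clearing a.ts from the dirty set would make `oldFn` visible again without any rewrite of baseline data.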

Pipeline Changes

  • PipelineContext gains layer and generation fields
  • ScipIndexerStage: only runs in baseline builds, writes layer='baseline' with generation
  • SourceIndexStage: overlay mode writes layer='overlay', preserves baseline rows; only deletes prior overlay rows
  • LspEnrichmentStage: cross-file enrichment for overlay mode only; baseline builds skip LSP for SCIP-covered languages
  • ResolutionStage: resolveSymbolEdges() accepts overlayOnly option to scope name-based resolution to overlay refs
  • New ReverseDepsStage: builds/updates reverse_deps from resolved imports + refs
  • New OverlayCleanupStage: atomic baseline promotion — deletes old generation, clears promoted overlay rows, rebuilds reverse_deps
  • IndexBuilder.baselineRebuild(): new method for background SCIP reconciliation with atomic promotion
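
The promotion step described for OverlayCleanupStage can be sketched as a pure function over an in-memory index. Everything here is illustrative: `IndexState`, `promote`, and the `promoted` parameter (the set of files whose overlay state the rebuild has absorbed) are assumptions, and the real stage presumably performs the equivalent deletes against the database rather than building a new object.

```typescript
type Layer = 'baseline' | 'overlay';

interface Row {
  filePath: string;
  layer: Layer;
  generation: number;
}

// Hypothetical in-memory stand-in for the index tables plus lore_meta.generation.
interface IndexState {
  generation: number;
  rows: Row[];
  dirtyFiles: Set<string>;
}

// Sketch of promotion. Assumes the rebuild has already written baseline rows
// tagged with newGeneration next to the old ones. Returns a new state so the
// caller can swap it in as a single step.
function promote(state: IndexState, newGeneration: number, promoted: Set<string>): IndexState {
  if (newGeneration <= state.generation) {
    throw new Error('generation must increase');
  }
  const rows = state.rows.filter((row) => {
    if (row.layer === 'baseline') {
      return row.generation === newGeneration; // drop the old baseline generation
    }
    return !promoted.has(row.filePath); // clear overlay rows that were promoted
  });
  const dirtyFiles = new Set<string>();
  state.dirtyFiles.forEach((file) => {
    if (!promoted.has(file)) dirtyFiles.add(file); // files edited since stay dirty
  });
  return { generation: newGeneration, rows, dirtyFiles };
}

const before: IndexState = {
  generation: 1,
  rows: [
    { filePath: 'a.ts', layer: 'baseline', generation: 1 },
    { filePath: 'a.ts', layer: 'baseline', generation: 2 },
    { filePath: 'a.ts', layer: 'overlay', generation: 1 },
    { filePath: 'b.ts', layer: 'baseline', generation: 1 },
    { filePath: 'b.ts', layer: 'baseline', generation: 2 },
    { filePath: 'c.ts', layer: 'baseline', generation: 1 },
    { filePath: 'c.ts', layer: 'baseline', generation: 2 },
    { filePath: 'c.ts', layer: 'overlay', generation: 1 },
  ],
  dirtyFiles: new Set(['a.ts', 'c.ts']),
};

// a.ts was absorbed by the rebuild; c.ts is treated as still dirty.
const after = promote(before, 2, new Set(['a.ts']));
```

The sketch omits the reverse_deps rebuild that the stage also performs.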

Watcher/Poller Changes

  • Immediate flushes are overlay-only (tree-sitter + LSP enrichment, no SCIP)
  • Deferred SCIP flush replaced with baselineRebuild() after quiet period
  • scipQuietPeriodMs now controls background baseline rebuild scheduling
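
The scheduling behaviour above (immediate overlay flush, baseline rebuild only after a quiet period) can be sketched with an explicit clock so it runs without timers. `RebuildScheduler`, `onChange`, and `tick` are hypothetical names, not Lore's watcher API; only the quiet-period idea and the two kinds of work come from the description.

```typescript
// Hypothetical scheduler: the caller supplies timestamps, so no real timers are needed.
class RebuildScheduler {
  private lastChangeMs: number | null = null;
  private readonly quietPeriodMs: number;
  private readonly flushOverlay: (files: string[]) => void;
  private readonly rebuildBaseline: () => void;

  constructor(
    quietPeriodMs: number,
    flushOverlay: (files: string[]) => void,
    rebuildBaseline: () => void,
  ) {
    this.quietPeriodMs = quietPeriodMs;
    this.flushOverlay = flushOverlay;
    this.rebuildBaseline = rebuildBaseline;
  }

  // Every change batch is flushed to the overlay right away and restarts the quiet period.
  onChange(files: string[], nowMs: number): void {
    this.flushOverlay(files);
    this.lastChangeMs = nowMs;
  }

  // Called periodically; triggers one baseline rebuild once the quiet period has elapsed.
  tick(nowMs: number): void {
    if (this.lastChangeMs !== null && nowMs - this.lastChangeMs >= this.quietPeriodMs) {
      this.lastChangeMs = null;
      this.rebuildBaseline();
    }
  }
}

const events: string[] = [];
const scheduler = new RebuildScheduler(
  1000, // plays the role of scipQuietPeriodMs
  (files) => { events.push(`overlay:${files.join(',')}`); },
  () => { events.push('baseline'); },
);

scheduler.onChange(['a.ts'], 0);
scheduler.tick(500); // still inside the quiet period
scheduler.onChange(['b.ts'], 800); // restarts the quiet period
scheduler.tick(1500); // only 700 ms since the last change
scheduler.tick(1800); // quiet period elapsed: baseline rebuild runs
scheduler.tick(3000); // nothing pending, no second rebuild
```

Passing the clock in keeps the sketch deterministic; a real watcher would more likely drive this from a timer.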

Resolution Taxonomy

  • Add overlay_stale resolution method for refs invalidated by file changes but not yet re-enriched

MCP Tool Freshness

  • FreshnessInfo type: { source: 'baseline'|'mixed'|'overlay', baseline_age_s, dirty_file_count }
  • getFreshness() helper in read-only.ts
  • Automatically injected into every MCP tool response via loggedHandler
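
One way the freshness value could be derived and attached to a response is sketched below. The `FreshnessInfo` fields come from the description; everything else is an assumption, including the `FreshnessInput` shape, the rule used to choose between 'baseline', 'mixed' and 'overlay', the numeric type of `baseline_age_s`, and the `withFreshness` wrapper standing in for what loggedHandler does.

```typescript
interface FreshnessInfo {
  source: 'baseline' | 'mixed' | 'overlay';
  baseline_age_s: number;
  dirty_file_count: number;
}

// Hypothetical inputs; the real helper presumably reads these from the database.
interface FreshnessInput {
  baselineBuiltAtMs: number;
  nowMs: number;
  dirtyFileCount: number;
  totalFileCount: number;
}

// Assumed classification: no dirty files means pure baseline, every file dirty
// means pure overlay, anything in between is mixed.
function computeFreshness(input: FreshnessInput): FreshnessInfo {
  let source: FreshnessInfo['source'] = 'mixed';
  if (input.dirtyFileCount === 0) {
    source = 'baseline';
  } else if (input.dirtyFileCount >= input.totalFileCount) {
    source = 'overlay';
  }
  const ageMs = input.nowMs - input.baselineBuiltAtMs;
  return {
    source,
    baseline_age_s: Math.max(0, Math.floor(ageMs / 1000)),
    dirty_file_count: input.dirtyFileCount,
  };
}

// Hypothetical wrapper: attaches freshness to any tool result object.
function withFreshness<T extends object>(
  result: T,
  freshness: FreshnessInfo,
): T & { freshness: FreshnessInfo } {
  return { ...result, freshness };
}

const freshness = computeFreshness({
  baselineBuiltAtMs: 0,
  nowMs: 90500,
  dirtyFileCount: 2,
  totalFileCount: 10,
});
const response = withFreshness({ symbols: ['parse'] }, freshness);
```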

Invariants

  1. Baseline is always complete — after a successful build, every file has baseline rows
  2. Overlay never deletes baseline — overlay rows coexist with baseline rows for the same file
  3. dirty_files is authoritative — a file has active overlay data iff it appears in dirty_files
  4. Generations are monotonic — lore_meta.generation only increases
  5. Cross-layer refs resolve by name — falls back to name-based lookup when callee_id doesn't exist in effective symbols
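
Invariant 5 can be illustrated with a small lookup. This is a sketch under assumptions: `SymbolRow`, `Ref`, the `calleeName` field, and `resolveCallee` are hypothetical, and returning the first name match is a simplification, since the description does not say how same-named symbols are disambiguated.

```typescript
// Hypothetical shapes for an effective symbol and a reference to a callee.
interface SymbolRow {
  id: number;
  name: string;
}

interface Ref {
  calleeId: number;
  calleeName: string;
}

// Sketch of the fallback: try the stored id first, then fall back to a
// name-based lookup when that id is absent from the effective symbols.
function resolveCallee(ref: Ref, effectiveSymbols: SymbolRow[]): SymbolRow | undefined {
  const byId = effectiveSymbols.find((s) => s.id === ref.calleeId);
  if (byId !== undefined) {
    return byId;
  }
  return effectiveSymbols.find((s) => s.name === ref.calleeName);
}

// The file defining `parse` was re-indexed into the overlay, so its symbol has a
// new id (10); a baseline ref still points at the old id (3).
const effectiveSymbols: SymbolRow[] = [
  { id: 10, name: 'parse' },
  { id: 11, name: 'format' },
];
const staleRef: Ref = { calleeId: 3, calleeName: 'parse' };
const resolved = resolveCallee(staleRef, effectiveSymbols);
```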

Test Results

All 1830 tests pass (98 test files, 0 failures).

jafreck added 2 commits March 15, 2026 12:32
…212)

Add incremental indexing using a baseline + overlay architecture so the
index stays accurate during active editing without blocking on a full
SCIP rebuild after every change.

Schema changes:
- Add layer (baseline|overlay) and generation columns to all data tables
- Add dirty_files table tracking files with active overlay data
- Add reverse_deps table for impact-set computation
- Add effective_* views that merge baseline + overlay layers
- Add generation, baseline_head_sha, overlay_head_sha metadata keys

Pipeline changes:
- PipelineContext gains layer and generation fields
- ScipIndexerStage: only runs in baseline builds, writes layer=baseline
- SourceIndexStage: overlay mode writes layer=overlay, preserves baseline
- LspEnrichmentStage: cross-file enrichment for overlay mode only
- ResolutionStage: scoped to overlay refs in overlay mode
- New ReverseDepsStage: mai…
- … scipQuietPeriodMs now controls baseline rebuild scheduling

Resolution taxonomy:
… in read-only.ts

Design: docs/incremental-index-design.md

Add tests for baseline+overlay architecture to restore coverage above
thresholds (85% statements, 87% lines):

- overlay-cleanup.test.ts: OverlayCleanupStage baseline promotion,
  dirty_files cleanup, generation metadata, reverse_deps rebuild
- incremental-schema.test.ts: generation helpers, dirty_files table,
  reverse_deps table, effective_* views, layer/generation columns
- freshness.test.ts: getFreshness() with baseline/mixed/overlay states
- source-index-overlay.test.ts: processFile with overlay layer, dirty
  marking, baseline preservation, prior overlay cleanup
- source-index-stage-overlay.test.ts: SourceIndexStage overlay update,
  file deletion in overlay/baseline modes, stale symbol tracking
- stage-layer-guards.test.ts: ScipIndexerStage overlay skip,
  LspEnrichmentStage baseline/overlay behavior
- overlay-resolution.test.ts: resolveSymbolEdges overlayOnly option
- index.test.ts: IndexBuilder.baselineRebuild() with generation tracking
  and overlay cleanup
@jafreck jafreck merged commit 04e59f5 into main Mar 15, 2026
1 check passed

codecov bot commented Mar 15, 2026

Codecov Report

❌ Patch coverage is 96.92308% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.54%. Comparing base (b39a1f3) to head (9316b11).
⚠️ Report is 1 commit behind head on main.

Files with missing lines | Patch % | Lines
src/indexer/stages/source-index.ts | 91.42% | 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #213      +/-   ##
==========================================
+ Coverage   87.43%   87.54%   +0.11%     
==========================================
  Files          82       84       +2     
  Lines        9429     9579     +150     
  Branches     2925     2958      +33     
==========================================
+ Hits         8244     8386     +142     
- Misses       1185     1193       +8     
