
ADR-800: Dynamic model catalog and OpenRouter support#357

Open
aaronsb wants to merge 14 commits into main from feature/adr-086-model-catalog

Conversation

aaronsb (Owner) commented Mar 15, 2026

Summary

  • OpenRouter provider — new OpenRouterProvider class using the OpenAI SDK with a custom base URL, supporting 200+ models through a single API key
  • Dynamic model catalog — provider_model_catalog database table replaces hardcoded model lists; stores per-model pricing, capabilities, and operator curation state
  • Interactive init flow — operator.sh init now offers provider selection (OpenAI/Anthropic/OpenRouter), API key validation, catalog-driven model browsing with price sort, and configurable max_tokens
  • Two-phase migrations — split schema/migrations/ (cold SQL) from schema/migrations-warm/ (AGE/Cypher DDL) to fix a cold-start race condition with migration 058
  • Catalog API endpoints — admin routes for listing, refreshing, enabling/disabling, and pricing models
  • Cost estimator integration — CostEstimator reads pricing from the catalog table before falling back to env vars
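The catalog-before-env lookup order described in the last bullet can be sketched as follows (a minimal illustration; function and variable names are mine, not the actual CostEstimator code):

```python
import os
from decimal import Decimal

def resolve_price(catalog_price, env_var, default=None):
    """Prefer the catalog's per-model price; fall back to an env var,
    then to a caller-supplied default. Illustrative sketch only."""
    if catalog_price is not None:
        return Decimal(str(catalog_price))
    env_value = os.environ.get(env_var)
    if env_value is not None:
        return Decimal(env_value)
    return default
```

Using Decimal throughout avoids float rounding on very small per-token prices.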

Key files

Area        Files
Provider    api/app/lib/ai_providers.py (OpenRouterProvider + fetch_model_catalog on all providers)
Catalog     api/app/lib/model_catalog.py, api/app/routes/models.py
Schema      schema/migrations/059_provider_model_catalog.sql
Operator    operator/configure.py (models subcommand), operator/lib/guided-init.sh
Migrations  operator/database/migrate-db.sh (--warm, continue-on-fail), operator/lib/start-infra.sh
Cost        api/app/services/job_analysis.py

ADR

docs/architecture/ai-embeddings/ADR-800-dynamic-model-catalog-and-openrouter-support.md

Test plan

  • Cold operator.sh init with OpenRouter — provider selection, key validation, catalog refresh (346 models), model selection with price sort, max_tokens prompt
  • Document ingestion via OpenRouter/GPT-5.1 — concepts extracted successfully
  • MCP search against extracted concepts — working
  • Two-phase migrations — cold migrations apply, postgres restarts, warm migration 058 applies after AGE init
  • Cold init with OpenAI (regression)
  • Cold init with Anthropic (regression)
  • configure.py models list/refresh/enable/disable/default/price commands
  • Admin API endpoints (/admin/models/catalog/*)

aaronsb added 14 commits March 15, 2026 15:33
Proposes replacing hardcoded model lists with a database-backed catalog
per provider, adding OpenRouter as a fourth inference endpoint, and
integrating per-model pricing into cost estimation.
…R-800)

- Add provider_model_catalog table (migration 059) with pricing, capabilities,
  and curation fields per model per provider
- Seed catalog with known models and pricing (migration 060)
- Add OpenRouterProvider using OpenAI SDK with custom base URL
- Add fetch_model_catalog() to all providers for dynamic discovery
- Add model_catalog.py with upsert/query/enable/default/pricing helpers
- Add admin API endpoints: GET/POST/PUT /admin/models/catalog/*
- Add configure.py 'models' subcommand: list, refresh, enable, disable, default, price
- Update CostEstimator to read pricing from catalog before env vars
- Wire OpenRouter into provider factory, rate limiter, and configure.py
- Update list_available_models() on all providers to prefer catalog
Replace hardcoded OpenAI setup in guided-init.sh with interactive flow:
- Step 4: Choose provider (OpenAI, Anthropic, OpenRouter)
- Step 5: Enter and validate API key
- Step 6: Refresh model catalog, present filtered menu, user picks model

OpenRouter shows curated subset (GPT-4o, Claude, Gemini, Llama, etc.)
with option [0] to show all 200+ models. Ollama noted as post-init config.

Also adds --tsv, --category, --limit flags to configure.py models list
for machine-parseable output used by the init script.
Needed by OpenRouter and Ollama fetch_model_catalog() which use
requests.get() directly rather than the OpenAI SDK.
Follow existing pattern — INSERT INTO public.schema_migrations at the
end of each migration file.
058 uses AGE Cypher DDL (CREATE VLABEL/ELABEL) which can fail on cold
start when AGE isn't fully initialized. Wrap each label creation in
EXCEPTION handler so failures log a notice instead of aborting — labels
get created on first use anyway.

Also add ON_ERROR_STOP=1 to psql in migrate-db.sh so failed migrations
don't self-register via the INSERT at the end of the file. Previously,
a mid-file failure would still record the migration as applied, then
the runner would exit, blocking all subsequent migrations.
Split migrations into two phases:
- schema/migrations/     — standard SQL (tables, indexes, permissions)
- schema/migrations-warm/ — AGE/Cypher DDL (needs running graph engine)

Move 058 (precreate_graph_labels) to migrations-warm. After cold
migrations, restart postgres so AGE is fully initialized, verify it's
healthy, then apply warm migrations.

Also:
- Migration runner continues past failures instead of stopping
- configure.py ai-provider handles missing catalog table gracefully
- Runner supports --warm flag to select migration directory
…input

The ai_extraction_config table has a CHECK constraint that only allowed
openai, anthropic, ollama. Add openrouter via migration 061.

Also remove -s flag from API key read so input is visible — easier to
verify the key was pasted correctly.
NUMERIC(12,6) overflows on high-cost OpenRouter models (image generation,
specialized). Use unconstrained NUMERIC which handles arbitrary precision.
No backwards compatibility needed — merge table creation, seed data,
openrouter constraint, and NUMERIC widening into one migration.
[$] cycles through: unsorted → cheapest first → most expensive first.
[0] toggles between curated and full model list. Both compose — sort
applies to whichever list is currently shown.
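The composition described above can be sketched as a pure function over the current view state (field names are hypothetical, not the actual configure.py data model):

```python
def apply_view(models, sort_mode, show_all):
    """Compose the [0] curated/full toggle with the [$] sort cycle.
    sort_mode is one of 'unsorted', 'cheapest', 'priciest'; the sort
    always applies to whichever list the toggle currently selects."""
    view = list(models) if show_all else [m for m in models if m["curated"]]
    if sort_mode == "cheapest":
        view.sort(key=lambda m: m["price"])
    elif sort_mode == "priciest":
        view.sort(key=lambda m: m["price"], reverse=True)
    return view
```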
The catalog ID is the 2nd positional arg which argparse maps to
provider_name, not model_id. Fall back to provider_name for actions
that expect a catalog ID.
4096 truncates extraction responses for verbose models via OpenRouter.
Bump to 16384 as interim fix — proper solution is catalog-driven token
limits per model.
- Add --max-tokens flag to configure.py ai-provider
- Store max_tokens in ai_extraction_config table (column already existed)
- OpenRouterProvider reads max_tokens from config (default 16384)
- Init flow prompts for max tokens with press-enter-to-accept default
- Factory passes max_tokens from database config to provider
aaronsb (Owner, Author) commented Mar 15, 2026

Code Review: ADR-800 Dynamic Model Catalog & OpenRouter Support

PR Size: ~2500 lines across 15 files -- thorough review focused on patterns, security, and structural concerns.

What this changes: Adds a provider_model_catalog database table for runtime model discovery, implements an OpenRouter provider for 200+ model access via unified API, wires up operator init flow with interactive model selection, and introduces a two-phase (cold/warm) migration strategy.


SECURITY: API Key Exposed in Process List

Location: operator/lib/guided-init.sh:385

Problem: The API key is passed as a command-line argument to docker exec:

docker exec kg-operator python /workspace/operator/configure.py api-key "$AI_PROVIDER" --key "$AI_KEY"

This makes the key visible in ps aux output, docker inspect, and host process listings for the duration of the command. Same issue at line 561 with --max-tokens (benign) and line 308 with --password (also a concern).

Why it matters: On multi-user systems or if any monitoring tool captures process lists, the API key and admin password are transiently exposed in plaintext.

Suggestion: Pipe the key via stdin or use an environment variable passed to docker exec:

echo "$AI_KEY" | docker exec -i kg-operator python /workspace/operator/configure.py api-key "$AI_PROVIDER" --key-stdin

Or use -e flag: docker exec -e API_KEY="$AI_KEY" ... and read from env inside the script.

Severity: Medium. The exposure window is brief and this is a local operator tool, but it is worth addressing before the init flow sees wider use.


SECURITY: Admin Routes Missing Authentication

Location: api/app/routes/models.py -- all endpoints (lines 32-179)

Problem: The admin_router endpoints have no authentication or authorization checks. Compare with api/app/routes/extraction.py which uses:

from ..dependencies.auth import CurrentUser, require_role, require_permission
# ...
_: None = Depends(require_permission("extraction_config", "read"))

The models routes are registered at /admin/models/* but anyone with network access to the API can list, refresh, enable, disable, or modify pricing for catalog models.

Why it matters: An unauthenticated user could enable expensive models, change defaults, or refresh the catalog (which makes outbound API calls using stored keys).

Suggestion: Add require_permission dependencies matching the pattern in extraction.py. At minimum, read operations should require "model_catalog", "read" and mutating operations should require "model_catalog", "write".

Severity: High -- this should be addressed before merge. The /admin/models/catalog/refresh endpoint is particularly sensitive since it triggers outbound API calls with stored credentials.


QUALITY: Repeated Connection Boilerplate in Routes

Location: api/app/routes/models.py -- every endpoint (lines 39-49, 83-93, 103-115, etc.)

Problem: Every route handler repeats the same pattern:

client = AGEClient()
conn = client.pool.getconn()
try:
    # ...
finally:
    client.pool.putconn(conn)

This appears 6 times in 180 lines. It violates DRY and creates risk -- if someone forgets the finally, connections leak.

Why it matters: Connection leak risk, maintenance burden, and inconsistency (some handlers catch HTTPException and re-raise, others don't).

Suggestion: Extract a context manager or FastAPI dependency:

def get_db_connection():
    client = AGEClient()
    conn = client.pool.getconn()
    try:
        yield conn
    finally:
        client.pool.putconn(conn)

# Then in routes:
@admin_router.get("/catalog")
async def get_catalog(conn=Depends(get_db_connection), ...):
    results = list_catalog(conn, ...)

Check if other route files in this project already have such a dependency -- if so, reuse it.

Severity: Low-medium. Functional but should be cleaned up.


QUALITY: Error Detail Leaks Internal State

Location: api/app/routes/models.py:49, 97, 115, 133, 151, 179

Problem: Exception messages are passed directly to the HTTP response:

raise HTTPException(status_code=500, detail=str(e))

This can expose database connection strings, SQL errors, file paths, and stack trace fragments to API consumers.

Suggestion: Log the full error server-side (already done), but return a generic message to the client:

raise HTTPException(status_code=500, detail="Failed to list catalog")

SCHEMA: ADR Specifies NUMERIC(12,6) but Migration Uses Unconstrained NUMERIC

Location: schema/migrations/059_provider_model_catalog.sql:25-27 vs ADR-800 line 59-61

Problem: The ADR specifies NUMERIC(12, 6) for pricing columns, but the migration uses bare NUMERIC (unconstrained). Commit f9011a9e explicitly widened from constrained to unconstrained with the message "widen pricing columns."

Assessment: This is actually fine in practice -- PostgreSQL's unconstrained NUMERIC avoids truncation issues with very small per-token prices (OpenRouter prices can be extremely small fractions). The ADR should be updated to match the implementation, or a note added explaining the deviation.

Severity: Informational -- no action needed on the code, but the ADR text is now slightly misleading.


DESIGN: set_model_default Has a Race Condition

Location: api/app/lib/model_catalog.py:144-178

Problem: set_model_default performs a read-then-two-writes pattern without wrapping in a transaction:

  1. SELECT the provider/category (line 152-157)
  2. UPDATE to clear existing default (line 163-168)
  3. UPDATE to set new default (line 171-176)
  4. COMMIT (line 177)

Although psycopg's implicit transaction spans all three statements until the final conn.commit(), the partial unique index idx_catalog_default could still cause trouble if two concurrent requests set defaults for the same provider+category simultaneously -- the second request's step 3 would violate the unique index before the first request's step 2 commits.

Suggestion: This is fine for the current single-operator use pattern, but if this ever gets concurrent API access, consider adding SELECT ... FOR UPDATE on the initial read, or using a single CTE-based statement.

Severity: Low -- single-operator tool, but worth noting.
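One way to serialize writers per provider+category is a transaction-scoped advisory lock -- a variant of the suggestion above, since a plain FOR UPDATE on the target row does not block a concurrent writer targeting a different row in the same group. Table and column names are assumed from the migration; this is a sketch, not the actual model_catalog.py code:

```python
def set_model_default_serialized(conn, catalog_id):
    """Clear-and-set the default under a per-(provider, category)
    advisory lock so concurrent default changes serialize. Sketch only."""
    with conn.cursor() as cur:
        cur.execute(
            "SELECT provider, category FROM provider_model_catalog WHERE id = %s",
            (catalog_id,),
        )
        provider, category = cur.fetchone()
        # Any concurrent call for the same provider+category blocks here
        # until this transaction commits, avoiding the transient
        # unique-index violation.
        cur.execute(
            "SELECT pg_advisory_xact_lock(hashtext(%s || ':' || %s))",
            (provider, category),
        )
        cur.execute(
            "UPDATE provider_model_catalog SET is_default = FALSE "
            "WHERE provider = %s AND category = %s AND is_default",
            (provider, category),
        )
        cur.execute(
            "UPDATE provider_model_catalog SET is_default = TRUE WHERE id = %s",
            (catalog_id,),
        )
    conn.commit()
```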


DESIGN: CostEstimator Creates DB Connections Per Call

Location: api/app/services/job_analysis.py:49-62

Problem: _get_catalog_price creates a new AGEClient and gets a connection from the pool on every cost lookup. During job analysis, this is called twice (extraction + embedding). The broader pattern in model_catalog.py functions also expects a connection to be passed in, but this helper creates its own.

Why it matters: Inconsistency -- model_catalog.py functions take conn as a parameter (good, testable), but the cost estimator creates its own connections (harder to test, pool churn).

Suggestion: Consider passing a connection through from the caller, or at least caching the client instance.

Severity: Low -- pool handles this fine at current scale.
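A minimal way to avoid the per-call construction is memoizing the client (AGEClient is the project's class, stubbed here so the sketch is self-contained):

```python
from functools import lru_cache

class AGEClient:  # stand-in for the project's real client class
    pass

@lru_cache(maxsize=1)
def shared_age_client():
    """Construct AGEClient once and reuse it across cost lookups,
    so _get_catalog_price stops paying client setup on every call."""
    return AGEClient()
```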


ROBUSTNESS: guided-init.sh API Key Input Not Masked

Location: operator/lib/guided-init.sh:374

Problem: The API key is read with plain read -p, not read -s -p (silent mode). The key is visible on screen as the user types.

read -p "${AI_KEY_PROMPT}: " AI_KEY

Suggestion: Use read -s -p and echo a newline after:

read -s -p "${AI_KEY_PROMPT}: " AI_KEY
echo ""

Severity: Medium -- usability/security issue during interactive setup. (Note: I see the ADR mentions the key input was changed to "unmask" in commit 986d6726 -- if this was intentional for copy-paste UX, document the tradeoff.)


POSITIVE: OpenRouterProvider Follows Existing Patterns Well

The OpenRouterProvider class (ai_providers.py:954-1288) is well-structured:

  • Follows the same AIProvider ABC interface as all other providers
  • Properly delegates embeddings to embedding_provider (OpenRouter doesn't serve embeddings)
  • Uses _load_api_key with the standard fallback chain (ADR-031)
  • Integrates with the rate limiter (get_provider_max_retries)
  • The fetch_model_catalog implementation correctly converts OpenRouter's per-token pricing strings to per-1M-token numerics
  • MIME type detection in describe_image matches the OpenAI provider pattern
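The per-token to per-1M-token conversion praised above amounts to something like this (a sketch, assuming OpenRouter reports pricing as per-token USD strings):

```python
from decimal import Decimal

PER_MILLION = Decimal(1_000_000)

def per_token_to_per_million(price_str):
    """Convert a per-token USD price string to the catalog's per-1M-token
    numeric. Decimal avoids float rounding on tiny per-token prices."""
    return Decimal(price_str) * PER_MILLION
```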

POSITIVE: Two-Phase Migration Design

The cold/warm migration split (migrate-db.sh --warm + schema/migrations-warm/) is a practical solution to the AGE extension chicken-and-egg problem. Cold migrations run pure SQL before AGE is fully initialized; warm migrations run after a PostgreSQL restart brings AGE online. The migration runner handles both phases cleanly, and start-infra.sh orchestrates the restart between them (lines 127-140).

POSITIVE: fetch_model_catalog as Non-Abstract Default

Making fetch_model_catalog a concrete method with an empty-list default (ai_providers.py:207) rather than an abstract method is the right call. It follows Open/Closed -- existing providers work unchanged, new providers opt in. Good application of Interface Segregation.

POSITIVE: Schema Design

The provider_model_catalog table design is solid:

  • Composite unique on (provider, model_id, category) handles the overlap case well
  • Partial unique index for is_default per provider+category is elegant
  • ON CONFLICT DO NOTHING in seed data is idempotent
  • upstream_provider for OpenRouter model provenance tracking

Summary

Category                 Count  Action
Security (auth missing)    1    Block -- add auth before merge
Security (key exposure)    2    Fix before wider deployment
Code quality               3    Address in follow-up
Schema/design              2    Informational
Positive patterns          4    --

Recommendation: Fix the missing authentication on /admin/models/* routes before merge. The API key exposure in guided-init.sh and the input masking are secondary but should be addressed soon. Everything else is solid work -- the OpenRouter integration, catalog design, and two-phase migration are well-thought-out.
