# LLM judge: reference file routing quality (trigger-based vs. content-based) #56

## Context

An audit of PR review patterns across 7 MongoDB Agent Skills identified scope-gating gaps in 11 instances across 5 PRs. A recurring sub-pattern: reference file descriptions in SKILL.md describe what is *in* the file rather than *when* to load it, which leaves agents without a clear routing signal.

## Proposed LLM judge check

Add a check that evaluates whether each reference file description in SKILL.md uses **trigger-based** language (when to load the file) rather than **content-based** language (what the file contains).

### The distinction

| Type | Example | Problem |
|------|---------|---------|
| Content-based (bad) | "Decision framework for relationships" | Tells agent what the file contains, not when to read it |
| Content-based (bad) | "Pre-calculate expensive aggregations" | Describes a pattern, not a trigger condition |
| Trigger-based (good) | "When the user needs to model a relationship between entities" | Tells agent when to load the file |
| Trigger-based (good) | "When query patterns include repeated expensive aggregations on the same data" | Observable condition the agent can evaluate |
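
One way to give the judge this distinction is to pass the four rows above as few-shot examples. The sketch below is illustrative only: `build_judge_prompt` and the prompt wording are hypothetical and are not part of any existing judge implementation.

```python
# Labelled examples copied from the table above.
FEW_SHOT = [
    ("Decision framework for relationships", "content-based"),
    ("Pre-calculate expensive aggregations", "content-based"),
    ("When the user needs to model a relationship between entities", "trigger-based"),
    ("When query patterns include repeated expensive aggregations on the same data", "trigger-based"),
]


def build_judge_prompt(description: str) -> str:
    """Assemble a hypothetical few-shot classification prompt for one description."""
    lines = [
        "Classify the reference file description as trigger-based or content-based.",
        "Trigger-based: states an observable condition under which to load the file.",
        "Content-based: states what the file contains.",
        "",
    ]
    for example, label in FEW_SHOT:
        lines.append(f'Description: "{example}"')
        lines.append(f"Label: {label}")
        lines.append("")
    # The description under test goes last, with the label left open for the model.
    lines.append(f'Description: "{description}"')
    lines.append("Label:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_judge_prompt("Subset pattern for hot/cold data separation"))
```

Keeping the examples in a plain list makes it easy to extend them as more PR review findings are collected.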

### What the judge should evaluate

1. Locate the reference file listing/index in SKILL.md (typically a table or bulleted list)
2. For each reference file description, classify as trigger-based or content-based
3. Flag content-based descriptions with a suggestion for trigger-based alternatives
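
The steps above can be approximated without a model, which may be useful as a cheap pre-filter or as a test fixture for the judge. The following is a minimal sketch under stated assumptions: the listing formats (`- name.md: description` and `| name.md | description |`) and the list of trigger openers are guesses, not a specification of SKILL.md, and an opener heuristic is far cruder than the LLM classification this issue proposes.

```python
import re

# Assumed opening words that signal a condition the agent can check.
TRIGGER_RE = re.compile(
    r"^(?:when|whenever|if|before|after|use when|load when|read when)\b",
    re.IGNORECASE,
)

# Hypothetical listing formats: "- name.md: description" and "| name.md | description |".
BULLET_RE = re.compile(r"^\s*[-*]\s*`?([\w./-]+\.md)`?\s*:\s*(.+)$")
TABLE_RE = re.compile(r"^\s*\|\s*`?([\w./-]+\.md)`?\s*\|\s*(.+?)\s*\|\s*$")


def extract_references(skill_md: str) -> list[tuple[str, str]]:
    """Step 1: collect (filename, description) pairs from a list or table."""
    refs = []
    for line in skill_md.splitlines():
        match = BULLET_RE.match(line) or TABLE_RE.match(line)
        if match:
            refs.append((match.group(1), match.group(2).strip()))
    return refs


def classify_description(description: str) -> str:
    """Step 2: label a description by how it opens (heuristic, not the LLM judge)."""
    text = description.strip().strip("\"'")
    return "trigger-based" if TRIGGER_RE.match(text) else "content-based"


def flag_content_based(skill_md: str) -> list[tuple[str, str]]:
    """Step 3: return the entries that still need a trigger-based rewrite."""
    return [
        (name, desc)
        for name, desc in extract_references(skill_md)
        if classify_description(desc) == "content-based"
    ]
```

Writing the suggested trigger-based alternative is left to the judge, since it requires understanding what the referenced file is for.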

### Output format

```
Reference file routing quality:
- ✓ vector-search.md: "When the user wants to implement semantic search" (trigger-based)
- ✗ subset.md: "Subset pattern for hot/cold data separation" (content-based)
  Suggestion: "When working with documents that have frequently-accessed fields alongside rarely-accessed fields"
```
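
A report in this shape could be rendered from structured results along the following lines. `RoutingResult` and `format_report` are hypothetical names introduced for illustration; the issue does not prescribe a data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RoutingResult:
    """One judged reference file entry (hypothetical structure)."""
    filename: str
    description: str
    label: str  # "trigger-based" or "content-based"
    suggestion: Optional[str] = None


def format_report(results: list[RoutingResult]) -> str:
    """Render results in the output format shown above."""
    lines = ["Reference file routing quality:"]
    for result in results:
        mark = "✓" if result.label == "trigger-based" else "✗"
        lines.append(f'- {mark} {result.filename}: "{result.description}" ({result.label})')
        # Only flagged entries carry a suggested rewrite.
        if result.label == "content-based" and result.suggestion:
            lines.append(f'  Suggestion: "{result.suggestion}"')
    return "\n".join(lines)
```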

### Scoring

Report as a ratio: `{trigger-based} / {total} reference descriptions use trigger-based routing`

A skill whose descriptions are all content-based should score low on Scope Discipline; a skill whose descriptions are all trigger-based demonstrates good routing.
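
The ratio itself is simple to compute once each description has a label. This sketch assumes labels arrive as plain strings; `routing_score` is a hypothetical helper. How the ratio maps onto a Scope Discipline score is not specified here, so the sketch only reports the counts.

```python
def routing_score(labels: list[str]) -> tuple[int, int, str]:
    """Count trigger-based labels and format the summary line from the Scoring section."""
    total = len(labels)
    trigger = sum(1 for label in labels if label == "trigger-based")
    summary = f"{trigger} / {total} reference descriptions use trigger-based routing"
    return trigger, total, summary


if __name__ == "__main__":
    _, _, line = routing_score(["trigger-based", "content-based"])
    print(line)
```

Returning the raw counts alongside the string lets a caller handle the empty case (no reference files at all) separately from the all-content-based case.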

### Examples from PR reviews

| PR | Issue |
|----|-------|
| 7 | No routing info for agents to decide which of 30 reference files to read |
| 7 | Pattern section uses terse descriptions that describe content, not triggers |
| 7 | Quick reference descriptions describe content, not routing triggers |

## Related

Part of a series of LLM judge enhancements derived from PR review pattern analysis. Complements the existing Scope Discipline dimension.
