## Context
An audit of PR review patterns across 7 MongoDB Agent Skills identified 10 instances of consistency issues or contradictions across 4 PRs: content within a skill contradicts itself, presents different things with misleading equivalence, or includes examples that don't align narratively.
## Proposed LLM judge check
Add a consistency check that evaluates three specific areas:
### Check 1: Metadata consistency between SKILL.md and reference files
If SKILL.md contains a summary table with impact/priority levels (e.g., "HIGH", "CRITICAL", "MEDIUM"), verify these match the impact levels stated in the individual reference files.
Example (from PR 7): Impact levels in SKILL.md quick reference table (all HIGH for fundamentals) were inconsistent with individual reference files (CRITICAL, MEDIUM).
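A sketch of how this check could be mechanized. The summary-table row format (`| topic | LEVEL |`) and the `Impact: LEVEL` declaration are assumptions made for illustration, not the actual skill file layout:

```python
import re

# Hypothetical helper for Check 1. The table-row and "Impact:" formats
# below are assumptions for illustration, not the actual skill layout.

LEVELS = r"(CRITICAL|HIGH|MEDIUM|LOW)"

def table_impact_levels(skill_md: str) -> dict:
    """Extract {topic: level} from summary rows like `| schema-validation | HIGH |`."""
    return dict(re.findall(r"\|\s*([\w-]+)\s*\|\s*" + LEVELS + r"\s*\|", skill_md))

def file_impact_level(reference_md: str):
    """Find a declaration like `Impact: CRITICAL` in a reference file."""
    m = re.search(r"Impact:\s*" + LEVELS, reference_md)
    return m.group(1) if m else None

def find_mismatches(skill_md: str, reference_files: dict) -> list:
    """Return (topic, table_level, file_level) tuples that disagree."""
    table = table_impact_levels(skill_md)
    mismatches = []
    for topic, body in reference_files.items():
        file_level = file_impact_level(body)
        if topic in table and file_level not in (None, table[topic]):
            mismatches.append((topic, table[topic], file_level))
    return mismatches
```

In practice the judge would do this comparison in natural language rather than with regexes; the point is only that the check reduces to "extract both sets of levels, report any topic where they disagree."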
### Check 2: Example narrative continuity
Within a single reference file, verify that examples in a walkthrough or tutorial use consistent field and variable names throughout. Flag cases where an example introduces fields (e.g., `name`, `email`) but subsequent code in the same walkthrough uses different fields (e.g., `address`) without explanation.
Example (from PR 7): Schema drift examples use `name`/`email`, but the migration code uses an `address` field.
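One way this could be approximated mechanically. This is a heuristic sketch; extracting "fields" as quoted identifiers inside fenced code blocks is an assumption, not the judge's actual parsing:

```python
import re

# Heuristic sketch for Check 2: flag a code block whose quoted field
# names share nothing with the fields used earlier in the walkthrough.
# Extracting fields as quoted identifiers is a crude assumption.

FENCE = re.compile(r"```[\w]*\n(.*?)```", re.DOTALL)
FIELD = re.compile(r"[\"']([A-Za-z_]\w*)[\"']")

def blocks(markdown: str) -> list:
    """Return the bodies of all fenced code blocks."""
    return FENCE.findall(markdown)

def field_drift(markdown: str) -> list:
    """Indices of code blocks whose fields are disjoint from all earlier blocks."""
    seen = set()
    flagged = []
    for i, block in enumerate(blocks(markdown)):
        fields = set(FIELD.findall(block))
        if seen and fields and not (fields & seen):
            flagged.append(i)
        seen |= fields
    return flagged
```

On a walkthrough whose first block inserts `name`/`email` and whose second block sets only `address`, this flags the second block, which is the PR 7 case.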
### Check 3: "Prefer X over Y" without caveats
Flag directive statements that recommend one approach over another without noting when the less-preferred approach is actually correct.
Example (from PR 2): "Prefer `$ne: null` over `$exists: false`" without noting the semantic difference (fields with explicit `null` values behave differently).
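The semantic difference is easy to demonstrate. The sketch below simulates MongoDB's matching rules over plain dicts (a simulation, not a driver call): a missing field compares equal to `null`, so `$ne: null`, `$exists: false`, and a plain `{field: null}` filter each select a different set of documents.

```python
# Simulation of MongoDB matching semantics over plain dicts.
docs = [
    {"_id": 1, "email": "ada@example.com"},  # present, non-null
    {"_id": 2, "email": None},               # present, explicit null
    {"_id": 3},                              # field absent
]

def exists_false(doc, field):
    # {field: {$exists: false}} -> only documents missing the field
    return field not in doc

def ne_null(doc, field):
    # {field: {$ne: null}} -> field present AND not null
    # (a missing field compares equal to null, so it is excluded too)
    return doc.get(field) is not None

def eq_null(doc, field):
    # {field: null} -> explicit null OR missing field
    return doc.get(field) is None

def ids(pred):
    return {d["_id"] for d in docs if pred(d, "email")}

print(ids(exists_false))  # {3}
print(ids(ne_null))       # {1}
print(ids(eq_null))       # {2, 3}
```

Document 2 (explicit `null`) is excluded by both `$exists: false` and `$ne: null`; only the plain `{email: null}` filter matches it. This is exactly the distinction a caveat should spell out.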
## Output format
Consistency issues found:
- SKILL.md line 34 says "HIGH" impact for schema-validation, but schema-validation.md says "CRITICAL"
- walkthrough.md: lines 10-25 use field "name" but lines 40-55 switch to "address" without transition
- Line 78: "Prefer X over Y" lacks a caveat about when Y is the correct choice
## Examples from PR reviews
| PR | Issue |
| --- | --- |
| 7 | Impact levels in quick reference table inconsistent with reference files |
| 7 | Schema drift examples use `name`/`email` but migration code uses `address` field |
| 7 | Soft 1MB guideline presented with same weight as hard 16MB limit |
| 5 | Language patterns reference describes "code examples" that don't exist in the referenced file |
| 2 | "Prefer `$ne: null` over `$exists: false`" without noting semantic difference |
## Related
Part of a series of LLM judge enhancements derived from PR review pattern analysis. Requires the judge to read multiple files within a skill, so implementation should consider whether this is a separate pass or integrated into per-file scoring.