## Context
From a PR review patterns audit across 7 MongoDB Agent Skills, testing/evidence gaps appeared in 8 instances across 4 PRs. The most common review comment was some form of "have you tested agent performance with and without this file?" — a question that should be answered by eval results, not reviewer intuition.
Issue #39 added support for recognizing the `evals/` directory. This issue proposes going further: checking that eval results actually exist.
## Proposed check
Add an opt-in warning-level check that verifies a skill's `evals/` directory contains result files, not just test definitions.
### Behavior
- Severity: warning (opt-in via flag, e.g., `--require-eval-results`)
- Check: If the `evals/` directory exists, verify it contains at least one file matching a results pattern (e.g., `*results*`, `*output*`, `*report*`, or `*.json` with a results-like structure)
- If `evals/` exists but contains only definition/config files and no results: warn
- If `evals/` does not exist: warn (when the opt-in flag is set)
### Message
> No eval results found in evals/ directory. Consider running evals and including results to demonstrate skill effectiveness.
### Why opt-in
Not all workflows will have eval infrastructure set up. Making the check opt-in (and eventually on by default in CI mode) allows gradual adoption.
## Examples from PR reviews
| PR | Issue |
|----|-------|
| 6 | "Have we tested agent performance with and without this file?" (asked for 3 reference files) |
| 5 | "Have you found agents unable to adhere to patterns without this guidance?" |
| 7 | "Did you find in testing that the skill needed this?" (asked twice) |
| 7 | Meta: "I wonder if there is a way to indicate what information a skill needs for efficacy" |
## Related
Builds on #39 (recognize the `evals/` directory). Part of a series of checks derived from PR review pattern analysis.