Add hosted evaluations section to eval docs by xeophon · Pull Request #1040 · PrimeIntellect-ai/verifiers

xeophon · 2026-03-19T21:25:59Z

Summary

- Add a short Hosted Evaluations section near the top of `docs/evaluation.md`.
- Link the new section from the table of contents.
- Document the basic `prime eval run --hosted` flow, including publishing first, `--follow`, TOML config usage, and the official hosted eval guide.

Testing

Not run (not requested)

Note

Low Risk
Low risk documentation-only change; no code paths or CLI behavior are modified.

Overview
Adds a new Hosted Evaluations section to docs/evaluation.md, linked from the table of contents, describing how to run prime eval run --hosted against Hub-published environments (including prime env push, --follow, and TOML config usage) and pointing to the official hosted-evals guide for hosted-only flags.

Written by Cursor Bugbot for commit aee2394. This will update automatically on new commits.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

cursor · 2026-03-19T21:27:32Z

docs/evaluation.md

+prime eval run configs/eval/benchmark-hosted.toml --hosted
+```
+
+For the full hosted workflow and hosted-only flags such as `--follow`, `--timeout-minutes`, `--allow-sandbox-access`, and `--custom-secrets`, see the official [Hosted Evaluations](https://docs.primeintellect.ai/tutorials-environments/hosted-evaluations) guide.


Missing skill update for hosted evaluations workflow

Low Severity

The PR adds a new Hosted Evaluations section to docs/evaluation.md documenting the --hosted flag, --follow, and other hosted-only flags, but skills/evaluate-environments/SKILL.md was not updated. The skill file only mentions "hosted eval workflows" in passing (line 63) and the example command there doesn't even use --hosted. The project rule requires that changes to docs/evaluation.md that affect user-facing workflows are reflected in the corresponding skill file.

Triggered by project rule: BugBot Instructions

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aee2394a5d

chatgpt-codex-connector · 2026-03-19T21:29:15Z

docs/evaluation.md

+prime env push my-env
+prime eval run my-team/my-env --hosted


Keep the pushed namespace consistent with the hosted slug

This example fails if someone copies it literally: prime env push my-env publishes under the caller's own namespace, but the next command runs my-team/my-env. Publishing to a team requires an explicit --team <team> on the push, so the doc should either keep the owner the same in both commands or show the team flag; otherwise readers end up targeting a hosted slug they never created.
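
One way to keep the two commands consistent is to derive both from the same owner and environment name. The sketch below is only illustrative: `my-team` and `my-env` are placeholder names, and the `--team` flag is taken from the description in this comment rather than verified against the CLI. It prints the commands for review instead of running them.

```bash
# Placeholder names (assumptions): replace with your own team and environment.
TEAM="my-team"
ENV_NAME="my-env"

# Build both commands from the same owner so the hosted slug always matches
# the namespace the environment was pushed to.
PUSH_CMD="prime env push ${ENV_NAME} --team ${TEAM}"
RUN_CMD="prime eval run ${TEAM}/${ENV_NAME} --hosted"

# Print the commands for review; run them by hand once they look right.
echo "${PUSH_CMD}"
echo "${RUN_CMD}"
```

For a personal namespace, the same idea applies: drop the team flag from the push and use your own owner name in the slug.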

chatgpt-codex-connector · 2026-03-19T21:29:15Z

docs/evaluation.md

+Hosted runs also support TOML configs:
+
+```bash
+prime eval run configs/eval/benchmark-hosted.toml --hosted


Explain that hosted TOML configs must use Hub slugs

The new hosted TOML example omits the main difference from the local TOML flow: hosted [[eval]].env_id entries must point at already-published Hub slugs like owner/my-env. Later in this same file, env_id is documented as an environment module name and every config example uses local IDs, so readers following this new snippet will naturally reuse gsm8k/my-env and get hosted runs that cannot resolve the environment.
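
A quick pre-flight check can catch this before a hosted run is submitted. The sketch below is not part of the CLI: `check_hosted_env_ids` is a hypothetical helper, and it assumes each `env_id` sits on its own line as a double-quoted string and that a Hub slug has the form `owner/name`. It only checks the format, not whether the slug is actually published.

```bash
# Hypothetical helper: list env_id entries that are not in owner/name form.
# Returns 0 if every env_id looks like a Hub slug, 1 otherwise.
check_hosted_env_ids() {
  config="$1"
  # Keep env_id assignments, then drop those whose quoted value is owner/name.
  bad="$(grep -E '^[[:space:]]*env_id[[:space:]]*=' "$config" \
    | grep -Ev '"[^"/]+/[^"/]+"' || true)"
  if [ -n "$bad" ]; then
    # Print the offending lines so they can be fixed before a hosted run.
    echo "$bad"
    return 1
  fi
  return 0
}
```

Running `check_hosted_env_ids configs/eval/benchmark-hosted.toml` before `prime eval run ... --hosted` would then flag any entry that still uses a local module name.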

add hosted evals to docs

aee2394

cursor bot reviewed Mar 19, 2026

chatgpt-codex-connector bot reviewed Mar 19, 2026

xeophon wants to merge 1 commit into main from add-hosted-evals-to-docs
