test: expand sanity suite with branding, deny, driver, and resilience tests#494
test: expand sanity suite with branding, deny, driver, and resilience tests#494
Conversation
There was a problem hiding this comment.
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.
Tip: disable this comment in your organization's Code Review settings.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughRenumbers and expands resilience tests to 10 steps, adds yolo-deny and no-internet checks, extends smoke tests (MongoDB, dbt-discover, check-command), augments install verification (semver, branding leak, driver resolvability), updates PR test generation, and adds MongoDB to Docker/CI. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/sanity/phases/resilience.sh`:
- Around line 175-189: This test ("No internet — graceful error") should be
skipped when ANTHROPIC_API_KEY is not set to avoid re-testing the missing-key
path; add a guard at the start of the block that checks ANTHROPIC_API_KEY and if
empty prints a skip message and increments the same SKIP_COUNT (or prints
PASS/skip consistent with other tests) then exits the block, so the
NO_NET_OUTPUT/altimate run logic only runs when ANTHROPIC_API_KEY is present.
- Around line 126-159: The test writes the deny config to the wrong location and
accepts generic model refusals as a pass; update the script so the generated
config is written to "$DENY_CONFIG_DIR/altimate-code/opencode.jsonc" (create the
altimate-code subdir before writing) so the loader will find it (use the
existing DENY_CONFIG_DIR variable), and tighten the assertion logic around
DENY_OUTPUT: only treat output containing explicit permission enforcement tokens
like "denied", "blocked", "not allowed", or "permission.*deny" as PASS and treat
any other responses (including generic refusals or rephrases) as FAIL; adjust
the grep branches (the current if/elif/else that inspects DENY_OUTPUT)
accordingly so the else no longer counts as a PASS.
In `@test/sanity/phases/smoke-tests.sh`:
- Around line 155-175: test_dbt_discover mutates the shared WORKDIR and ignores
failures from altimate_run_with_turns; modify it to create and cd into an
isolated temp directory (e.g., mktemp -d) instead of cd "$WORKDIR", run the dbt
project creation and git commands inside that temp dir, and ensure you return to
the original directory (or use pushd/popd). Capture the exit code of
altimate_run_with_turns (called in test_dbt_discover) rather than using "||
true" and write "FAIL" to "$RESULTS_DIR/dbt-discover" when the command fails or
the output contains error substrings; otherwise write "PASS". Keep references to
test_dbt_discover, altimate_run_with_turns, WORKDIR, RESULTS_DIR, and get_output
to locate the changes.
- Around line 178-187: The test invokes altimate check incorrectly with a --file
flag and masks failures with "|| true"; in test_check_command replace "timeout
30 altimate check --file check_target.sql 2>&1 || true" with a positional file
invocation "timeout 30 altimate check check_target.sql 2>&1" (remove the --file
and the "|| true") so the command uses the correct positional argument and
non-zero failures aren't suppressed; update the call in the test_check_command
function that writes check_target.sql and leaves logging to RESULTS_DIR as
before.
In `@test/sanity/phases/verify-install.sh`:
- Around line 57-65: The current capture of HELP_OUTPUT uses "|| echo """ which
masks failures of the altimate help commands; change the logic so HELP_OUTPUT is
assigned from the command output (e.g., HELP_OUTPUT=$(altimate --help 2>&1))
without the fallback, then check the command exit status and treat a non-zero
exit as a failure (increment FAIL_COUNT and print an error) before running the
branding grep against HELP_OUTPUT; apply the same change to the altimate run
--help block (the block around lines 70-78) so command failures aren't silenced
and only successful output is scanned for branding leaks.
- Around line 44-52: The semver check currently uses grep -qE
'^[0-9]+\.[0-9]+\.[0-9]+' which only matches a prefix; update the pattern used
in the conditional that checks VERSION_CLEAN so it anchors the end (e.g., change
to '^[0-9]+\.[0-9]+\.[0-9]+$') or replace it with a full semver regex if
pre/post-release metadata should be allowed; modify the grep invocation around
the VERSION_CLEAN variable (the conditional that contains grep -qE) accordingly.
In `@test/sanity/pr-tests/generate.sh`:
- Around line 88-90: The dbt-discover PR test is emitted without provisioning a
sample dbt project, so auto-discovery won't be exercised; update the emit_test
call for "dbt-discover" so it first creates a minimal dbt project (add
dbt_project.yml and a models/test_model.sql) or invoke the existing scaffold
used by test/sanity/phases/smoke-tests.sh before running altimate, ensuring the
repository contains those files when emit_test "dbt-discover" (and the related
command string) is executed; reference the emit_test "dbt-discover" entry in
generate.sh and the dbt project scaffold used by smoke-tests.sh to mirror the
same setup.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 194fd102-82ec-4295-99cf-ab43ad4fafa5
📒 Files selected for processing (4)
test/sanity/phases/resilience.shtest/sanity/phases/smoke-tests.shtest/sanity/phases/verify-install.shtest/sanity/pr-tests/generate.sh
| # 10. Version matches semver format (X.Y.Z) — catches #212 regressions | ||
| VERSION_CLEAN=$(echo "$VERSION" | head -1 | tr -d '[:space:]') | ||
| if echo "$VERSION_CLEAN" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+'; then | ||
| echo " PASS: version is semver ($VERSION_CLEAN)" | ||
| PASS_COUNT=$((PASS_COUNT + 1)) | ||
| else | ||
| echo " FAIL: version is not semver (got '$VERSION_CLEAN')" | ||
| FAIL_COUNT=$((FAIL_COUNT + 1)) | ||
| fi |
There was a problem hiding this comment.
Anchor the semver match.
^[0-9]+\.[0-9]+\.[0-9]+ is only a prefix check, so values like 1.2.3-dev or 1.2.3foo still pass. If this step is meant to enforce X.Y.Z, end the pattern with $; otherwise use an explicit semver regex.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/sanity/phases/verify-install.sh` around lines 44 - 52, The semver check
currently uses grep -qE '^[0-9]+\.[0-9]+\.[0-9]+' which only matches a prefix;
update the pattern used in the conditional that checks VERSION_CLEAN so it
anchors the end (e.g., change to '^[0-9]+\.[0-9]+\.[0-9]+$') or replace it with
a full semver regex if pre/post-release metadata should be allowed; modify the
grep invocation around the VERSION_CLEAN variable (the conditional that contains
grep -qE) accordingly.
| HELP_OUTPUT=$(altimate --help 2>&1 || echo "") | ||
| BRANDING_LEAKS=$(echo "$HELP_OUTPUT" | grep -iE 'opencode' | grep -ivE '\.opencode|opencode\.json[c]?|@opencode-ai|opencode\.local|OPENCODE_' || true) | ||
| if [ -z "$BRANDING_LEAKS" ]; then | ||
| echo " PASS: --help has no upstream branding leaks" | ||
| PASS_COUNT=$((PASS_COUNT + 1)) | ||
| else | ||
| echo " FAIL: --help contains upstream branding:" | ||
| echo "$BRANDING_LEAKS" | head -5 | sed 's/^/ /' | ||
| FAIL_COUNT=$((FAIL_COUNT + 1)) |
There was a problem hiding this comment.
Don't turn a broken help command into a passing leak check.
|| echo "" makes both help captures look clean even when altimate --help or altimate run --help exits non-zero. Fail the step on the command error first, then run the branding grep on the captured output.
Also applies to: 70-78
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/sanity/phases/verify-install.sh` around lines 57 - 65, The current
capture of HELP_OUTPUT uses "|| echo """ which masks failures of the altimate
help commands; change the logic so HELP_OUTPUT is assigned from the command
output (e.g., HELP_OUTPUT=$(altimate --help 2>&1)) without the fallback, then
check the command exit status and treat a non-zero exit as a failure (increment
FAIL_COUNT and print an error) before running the branding grep against
HELP_OUTPUT; apply the same change to the altimate run --help block (the block
around lines 70-78) so command failures aren't silenced and only successful
output is scanned for branding leaks.
685d5c0 to
a3c2f9a
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (4)
test/sanity/phases/resilience.sh (2)
180-194:⚠️ Potential issue | 🟡 MinorSkip this block when no API key is configured.
If
ANTHROPIC_API_KEYis empty, this re-tests the missing-key path from step 9 and can PASS without ever exercising the#181no-internet branch.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/resilience.sh` around lines 180 - 194, Add a guard before the "No internet graceful handling" test block that checks the ANTHROPIC_API_KEY environment variable and skips the entire block if it's empty; specifically, before invoking altimate run (the NO_NET_OUTPUT block) check [ -z "$ANTHROPIC_API_KEY" ] and print a SKIP message and increment PASS_COUNT or leave counters unchanged as your test suite expects, so the missing-key path from step 9 isn't retested and the `#181` no-internet branch is actually exercised only when an API key is present.
149-162:⚠️ Potential issue | 🟠 MajorFail ambiguous deny-test outcomes.
The final
elsestill records PASS when the model self-refuses or the CLI fails before permission evaluation. That means this step can greenlight#372/#377without proving the deny rule actually fired.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/resilience.sh` around lines 149 - 162, The final fallback branch currently records a PASS when neither the marker file ($DENY_MARKER) exists nor the deny-indicative text appears in $DENY_OUTPUT, which creates false positives; change the final else in the deny check to treat this ambiguous outcome as a FAIL by logging a clear "AMBIGUOUS: deny rule not proven (marker absent and no deny message)" message and incrementing FAIL_COUNT instead of PASS_COUNT so that only explicit deny evidence (marker present or deny text in $DENY_OUTPUT) records a PASS; adjust messages touched around DENY_MARKER, DENY_OUTPUT, PASS_COUNT, and FAIL_COUNT accordingly.test/sanity/phases/smoke-tests.sh (1)
174-179:⚠️ Potential issue | 🟠 MajorTreat
TIMEOUTas a faileddbt-discoverrun.
altimate_run_with_turnswritesTIMEOUTwhen exit code 124 occurs, but this block only greps for crash strings. A timed-out discovery can still be recorded as PASS.🐛 Proposed fix
- SANITY_TIMEOUT=90 altimate_run_with_turns "dbt-discover" 2 "discover dbt project config in this repo" || true - local output=$(get_output "dbt-discover") - if echo "$output" | grep -qi "unhandled\|TypeError\|Cannot read"; then + local output status + SANITY_TIMEOUT=90 altimate_run_with_turns "dbt-discover" 2 "discover dbt project config in this repo" + status=$? + output=$(get_output "dbt-discover") + if [ $status -ne 0 ] || echo "$output" | grep -qi "TIMEOUT\|unhandled\|TypeError\|Cannot read"; then🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` around lines 174 - 179, The test currently only checks for crash strings in the dbt-discover output, so timed-out runs (when altimate_run_with_turns writes "TIMEOUT") can be marked PASS; update the post-run check in the dbt-discover block to treat a "TIMEOUT" result as a failure by also matching "TIMEOUT" (or exit code 124 if you prefer) in the captured output from get_output; change the conditional that uses echo "$output" | grep -qi "unhandled\|TypeError\|Cannot read" to include "TIMEOUT" (e.g., "unhandled\|TypeError\|Cannot read\|TIMEOUT") so that the script writes FAIL to RESULTS_DIR/dbt-discover when a timeout occurred.test/sanity/pr-tests/generate.sh (1)
88-90:⚠️ Potential issue | 🟠 MajorProvision a dbt project before emitting
dbt-discover.This command runs in the repo as-is, so most PRs will not contain
dbt_project.ymlplus a model and therefore will not exercise auto-discovery.test/sanity/phases/smoke-tests.shalready builds the minimal scaffold intest_dbt_discover.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/pr-tests/generate.sh` around lines 88 - 90, The dbt-discover test is emitted without provisioning a minimal dbt project, so update the generate.sh flow to create the scaffold used by smoke-tests before calling emit_test: detect the dbt change block and, prior to emit_test "dbt-discover", create or copy the minimal scaffold (dbt_project.yml and at least one model) into the test_dbt_discover directory — you can reuse the same setup steps or invoke the setup logic from test/sanity/phases/smoke-tests.sh — then call emit_test "dbt-discover" "altimate run --max-turns 2 --yolo 'discover dbt project config'".
🧹 Nitpick comments (1)
test/sanity/pr-tests/generate.sh (1)
78-85: Make these PR-specific entries invoke focused checks.
yolo-denyandbranding-checkboth dispatch whole phase scripts, so unrelated resilience/install failures can dominate the signal from the regression you were trying to isolate. A tiny wrapper or subtest selector would keep these manifest entries focused.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/pr-tests/generate.sh` around lines 78 - 85, The PR entries "yolo-deny" and "branding-check" currently invoke entire phase scripts via emit_test ("yolo-deny" -> "$SCRIPT_DIR/phases/resilience.sh", "branding-check" -> "$SCRIPT_DIR/phases/verify-install.sh"), causing noisy unrelated failures; change them to run focused checks by having emit_test call small wrapper scripts or the phase scripts with a subtest selector argument (e.g., run "$SCRIPT_DIR/phases/resilience.sh deny-only" or a new "$SCRIPT_DIR/phases/yolo-deny.sh", and similarly a "$SCRIPT_DIR/phases/branding-check.sh" or "verify-install.sh branding-only") so the generated tests execute only the specific regression subtests rather than full phases.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/sanity/phases/resilience.sh`:
- Around line 131-146: The heredoc for opencode.jsonc references ${DENY_MARKER}
before it's defined, which breaks under set -u; move the DENY_MARKER assignment
so it is defined (e.g., set DENY_MARKER="$DENY_CONFIG_DIR/deny-marker") before
the heredoc that writes to "$DENY_CONFIG_DIR/altimate-code/opencode.jsonc" so
the ${DENY_MARKER} expansion succeeds; keep the rest of the mkdir and heredoc
intact and ensure DENY_CONFIG_DIR is still created first.
---
Duplicate comments:
In `@test/sanity/phases/resilience.sh`:
- Around line 180-194: Add a guard before the "No internet graceful handling"
test block that checks the ANTHROPIC_API_KEY environment variable and skips the
entire block if it's empty; specifically, before invoking altimate run (the
NO_NET_OUTPUT block) check [ -z "$ANTHROPIC_API_KEY" ] and print a SKIP message
and increment PASS_COUNT or leave counters unchanged as your test suite expects,
so the missing-key path from step 9 isn't retested and the `#181` no-internet
branch is actually exercised only when an API key is present.
- Around line 149-162: The final fallback branch currently records a PASS when
neither the marker file ($DENY_MARKER) exists nor the deny-indicative text
appears in $DENY_OUTPUT, which creates false positives; change the final else in
the deny check to treat this ambiguous outcome as a FAIL by logging a clear
"AMBIGUOUS: deny rule not proven (marker absent and no deny message)" message
and incrementing FAIL_COUNT instead of PASS_COUNT so that only explicit deny
evidence (marker present or deny text in $DENY_OUTPUT) records a PASS; adjust
messages touched around DENY_MARKER, DENY_OUTPUT, PASS_COUNT, and FAIL_COUNT
accordingly.
In `@test/sanity/phases/smoke-tests.sh`:
- Around line 174-179: The test currently only checks for crash strings in the
dbt-discover output, so timed-out runs (when altimate_run_with_turns writes
"TIMEOUT") can be marked PASS; update the post-run check in the dbt-discover
block to treat a "TIMEOUT" result as a failure by also matching "TIMEOUT" (or
exit code 124 if you prefer) in the captured output from get_output; change the
conditional that uses echo "$output" | grep -qi "unhandled\|TypeError\|Cannot
read" to include "TIMEOUT" (e.g., "unhandled\|TypeError\|Cannot read\|TIMEOUT")
so that the script writes FAIL to RESULTS_DIR/dbt-discover when a timeout
occurred.
In `@test/sanity/pr-tests/generate.sh`:
- Around line 88-90: The dbt-discover test is emitted without provisioning a
minimal dbt project, so update the generate.sh flow to create the scaffold used
by smoke-tests before calling emit_test: detect the dbt change block and, prior
to emit_test "dbt-discover", create or copy the minimal scaffold
(dbt_project.yml and at least one model) into the test_dbt_discover directory —
you can reuse the same setup steps or invoke the setup logic from
test/sanity/phases/smoke-tests.sh — then call emit_test "dbt-discover" "altimate
run --max-turns 2 --yolo 'discover dbt project config'".
---
Nitpick comments:
In `@test/sanity/pr-tests/generate.sh`:
- Around line 78-85: The PR entries "yolo-deny" and "branding-check" currently
invoke entire phase scripts via emit_test ("yolo-deny" ->
"$SCRIPT_DIR/phases/resilience.sh", "branding-check" ->
"$SCRIPT_DIR/phases/verify-install.sh"), causing noisy unrelated failures;
change them to run focused checks by having emit_test call small wrapper scripts
or the phase scripts with a subtest selector argument (e.g., run
"$SCRIPT_DIR/phases/resilience.sh deny-only" or a new
"$SCRIPT_DIR/phases/yolo-deny.sh", and similarly a
"$SCRIPT_DIR/phases/branding-check.sh" or "verify-install.sh branding-only") so
the generated tests execute only the specific regression subtests rather than
full phases.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d51210b6-5621-4b8c-b02e-8d84f0bc2277
📒 Files selected for processing (3)
test/sanity/phases/resilience.shtest/sanity/phases/smoke-tests.shtest/sanity/pr-tests/generate.sh
bebcf5a to
ac95c91
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (4)
test/sanity/phases/verify-install.sh (2)
57-58:⚠️ Potential issue | 🟠 MajorDon't treat a broken help path as a clean branding check.
Both help probes still use
|| echo "", so a non-zeroaltimate --help/altimate run --helpis turned into empty, leak-free output instead of failing the install verification step.Also applies to: 70-71
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/verify-install.sh` around lines 57 - 58, The check is swallowing failures because HELP_OUTPUT and the other help capture use "|| echo """, turning a non-zero exit from "altimate --help" (and the duplicate "altimate run --help" at lines 70-71) into an empty, leak-free string; remove the "|| echo """/similar fallback so the command's non-zero exit propagates and the install verification fails on broken help paths (i.e., update the HELP_OUTPUT assignments for "altimate --help" and "altimate run --help" to capture output without appending "|| echo \"\"").
44-46:⚠️ Potential issue | 🟡 MinorAnchor the semver regex.
^[0-9]+\.[0-9]+\.[0-9]+is still only a prefix match, so values like1.2.3-devor1.2.3foowill pass this regression check.Minimal fix
-if echo "$VERSION_CLEAN" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+'; then +if echo "$VERSION_CLEAN" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+$'; then🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/verify-install.sh` around lines 44 - 46, The semver check uses VERSION_CLEAN piped into a grep -qE in the if condition and currently only matches a prefix, allowing values like 1.2.3-dev; update that grep pattern to require the end-of-string anchor so it only accepts exact X.Y.Z versions (modify the pattern used in the if statement that calls grep -qE with VERSION_CLEAN).test/sanity/phases/smoke-tests.sh (2)
189-191:⚠️ Potential issue | 🟠 Major
dbt-discovertimeouts can still report PASS.
altimate_run_with_turnswritesTIMEOUTon exit 124, but this block suppresses the status and only looks forunhandled|TypeError|Cannot read. A hung discovery run therefore falls through to PASS.Minimal fix
- if echo "$output" | grep -qi "unhandled\|TypeError\|Cannot read"; then + if echo "$output" | grep -qi "TIMEOUT\|unhandled\|TypeError\|Cannot read"; then🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` around lines 189 - 191, The current block runs SANITY_TIMEOUT=90 altimate_run_with_turns "dbt-discover" ... || true which discards the exit status, so TIMEOUT (exit 124) can be treated as PASS; change the logic in the dbt-discover invocation to capture and inspect the command exit code (the return from altimate_run_with_turns) and/or the output for the literal "TIMEOUT", and fail the test if the exit code equals 124 or output contains "TIMEOUT"; specifically update the block that calls altimate_run_with_turns "dbt-discover" and the subsequent use of get_output "dbt-discover" (and variable output) so that timeout conditions produce a non-PASS result instead of being suppressed.
203-204:⚠️ Potential issue | 🟠 MajorDon't blanket-ignore
timeoutintest_check_command.
timeout 30 ... || truediscards exit 124 and any other fatal status, so hangs or runner/usage failures can still be recorded as PASS unless stderr happens to match the current grep. Ifaltimate checkintentionally uses a non-zero exit for findings, whitelist that status explicitly instead of swallowing every non-zero.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` around lines 203 - 204, Do not blanket-ignore the command exit status; instead run the check capturing both output and exit code (replace the current subshell that uses "|| true"), e.g., run "timeout 30 altimate check check_target.sql" into a variable "output" and immediately capture "$?" into "exit_code", then only treat specific, expected non-zero codes as allowed (whitelist the exit code used by altimate for findings) while treating 124 (timeout) and other unexpected non-zero values as failures; update the subsequent logic (the grep that checks "$output") to run only after this explicit exit_code handling so timeouts/runner errors are reported as failures rather than swallowed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@test/sanity/phases/smoke-tests.sh`:
- Around line 189-191: The current block runs SANITY_TIMEOUT=90
altimate_run_with_turns "dbt-discover" ... || true which discards the exit
status, so TIMEOUT (exit 124) can be treated as PASS; change the logic in the
dbt-discover invocation to capture and inspect the command exit code (the return
from altimate_run_with_turns) and/or the output for the literal "TIMEOUT", and
fail the test if the exit code equals 124 or output contains "TIMEOUT";
specifically update the block that calls altimate_run_with_turns "dbt-discover"
and the subsequent use of get_output "dbt-discover" (and variable output) so
that timeout conditions produce a non-PASS result instead of being suppressed.
- Around line 203-204: Do not blanket-ignore the command exit status; instead
run the check capturing both output and exit code (replace the current subshell
that uses "|| true"), e.g., run "timeout 30 altimate check check_target.sql"
into a variable "output" and immediately capture "$?" into "exit_code", then
only treat specific, expected non-zero codes as allowed (whitelist the exit code
used by altimate for findings) while treating 124 (timeout) and other unexpected
non-zero values as failures; update the subsequent logic (the grep that checks
"$output") to run only after this explicit exit_code handling so timeouts/runner
errors are reported as failures rather than swallowed.
In `@test/sanity/phases/verify-install.sh`:
- Around line 57-58: The check is swallowing failures because HELP_OUTPUT and
the other help capture use "|| echo """, turning a non-zero exit from "altimate
--help" (and the duplicate "altimate run --help" at lines 70-71) into an empty,
leak-free string; remove the "|| echo """/similar fallback so the command's
non-zero exit propagates and the install verification fails on broken help paths
(i.e., update the HELP_OUTPUT assignments for "altimate --help" and "altimate
run --help" to capture output without appending "|| echo \"\"").
- Around line 44-46: The semver check uses VERSION_CLEAN piped into a grep -qE
in the if condition and currently only matches a prefix, allowing values like
1.2.3-dev; update that grep pattern to require the end-of-string anchor so it
only accepts exact X.Y.Z versions (modify the pattern used in the if statement
that calls grep -qE with VERSION_CLEAN).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 93b7bc62-dfb2-47c7-9503-c79bdf8989d2
📒 Files selected for processing (7)
test/sanity/Dockerfiletest/sanity/ci-local.shtest/sanity/docker-compose.ymltest/sanity/phases/resilience.shtest/sanity/phases/smoke-tests.shtest/sanity/phases/verify-install.shtest/sanity/pr-tests/generate.sh
🚧 Files skipped from review as they are similar to previous changes (2)
- test/sanity/pr-tests/generate.sh
- test/sanity/phases/resilience.sh
ac95c91 to
fa157c0
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (5)
test/sanity/phases/smoke-tests.sh (2)
213-219:⚠️ Potential issue | 🟠 MajorStop suppressing failures in the new dbt/check smoke paths.
Line 213 still ends the
altimate_run_with_turnscall with|| true, and Line 227 does the same fortimeout 30 altimate check .... That means timeouts, usage errors, and other non-zero exits can still be recorded asPASSif the output misses the current grep patterns.Suggested fix
- SANITY_TIMEOUT=90 altimate_run_with_turns "dbt-discover" 2 "discover dbt project config in this repo" || true - local output=$(get_output "dbt-discover") - if echo "$output" | grep -qi "unhandled\|TypeError\|Cannot read"; then + local status output + SANITY_TIMEOUT=90 altimate_run_with_turns "dbt-discover" 2 "discover dbt project config in this repo" + status=$? + output=$(get_output "dbt-discover") + if [ $status -ne 0 ] || echo "$output" | grep -qi "TIMEOUT\|unhandled\|TypeError\|Cannot read"; then echo "FAIL" > "$RESULTS_DIR/dbt-discover" else echo "PASS" > "$RESULTS_DIR/dbt-discover" fi @@ - local output=$(timeout 30 altimate check check_target.sql 2>&1 || true) - if echo "$output" | grep -qi "TypeError\|unhandled\|Cannot read properties"; then + local status output + output=$(timeout 30 altimate check check_target.sql 2>&1) + status=$? + if [ $status -ne 0 ] || echo "$output" | grep -qi "TypeError\|unhandled\|Cannot read properties"; then echo "FAIL" > "$RESULTS_DIR/check-cmd" else echo "PASS" > "$RESULTS_DIR/check-cmd" fiAlso applies to: 223-232
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` around lines 213 - 219, The smoke test is suppressing failures by appending "|| true" to the altimate invocations, so remove the "|| true" suffix from the altimate_run_with_turns "dbt-discover" invocation (and the "timeout 30 altimate check ..." invocation in the dbt/check section) so their non-zero exit codes propagate; keep the existing get_output and output-grepping logic but let altimate_run_with_turns and timeout+altimate return failures (so failures are written as FAIL instead of masked as PASS).
141-147:⚠️ Potential issue | 🟠 MajorFail the MongoDB smoke test on non-zero CLI exits.
altimate_runreturns the command exit code (test/sanity/lib/altimate-run.sh line 24), but the calling code discards it immediately. A connection, authentication, or other failure that doesn't produce output containingTIMEOUT,unhandled, ordriver not installedwill incorrectly writePASS.Suggested fix
- altimate_run "mongodb" "run a find command on the admin database against mongodb at ${mongo_host}:${mongo_port}" - local output=$(get_output "mongodb") - if echo "$output" | grep -qi "TIMEOUT\|unhandled\|driver not installed"; then + local status output + altimate_run "mongodb" "run a find command on the admin database against mongodb at ${mongo_host}:${mongo_port}" + status=$? + output=$(get_output "mongodb") + if [ $status -ne 0 ] || echo "$output" | grep -qi "TIMEOUT\|unhandled\|driver not installed"; then echo "FAIL" > "$RESULTS_DIR/mongodb" else echo "PASS" > "$RESULTS_DIR/mongodb" fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` around lines 141 - 147, The current MongoDB smoke test calls altimate_run ("altimate_run" against "${mongo_host}:${mongo_port}") but ignores its exit status, so failures without matching output still pass; modify the test to capture altimate_run's exit code immediately (e.g., run altimate_run then save "$?" into a variable) and treat any non-zero exit as a FAIL by writing "FAIL" to "$RESULTS_DIR/mongodb"; only proceed to inspect get_output("mongodb") for the textual failure patterns when the exit code is zero.test/sanity/phases/verify-install.sh (2)
57-66:⚠️ Potential issue | 🟠 MajorDon't let a broken help command count as a clean branding scan.
Lines 57 and 70 still use
|| echo "", soaltimate --help/altimate run --helpcan exit non-zero and this phase will still pass as long as the error output lacksopencode. Fail on the command status first, then run the leak grep only on successful output.Also applies to: 70-79
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/verify-install.sh` around lines 57 - 66, The help-output capture currently swallows failures by using "|| echo \"\"", so a non-zero exit from "altimate --help" or "altimate run --help" can still be treated as a pass; change the logic in the HELP_OUTPUT/BRANDING_LEAKS checks to first run the help command and check its exit status (capture output into HELP_OUTPUT, test $? or use an if command; e.g., run "altimate --help 2>&1" and if it exits non-zero increment FAIL_COUNT and print a failure about the help command), and only when the help command succeeds run the grep/BRANDING_LEAKS logic (keep using BRANDING_LEAKS to search the captured HELP_OUTPUT). Ensure the same fix is applied to both the HELP_OUTPUT block and the analogous "altimate run --help" block.
44-52:⚠️ Potential issue | 🟡 MinorAnchor the semver regex.
Line 46 is still only a prefix check, so values like
1.2.3-devor1.2.3foowill pass even though this step says it enforcesX.Y.Z.Suggested fix
-if echo "$VERSION_CLEAN" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+'; then +if echo "$VERSION_CLEAN" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+$'; then🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/verify-install.sh` around lines 44 - 52, The semver check currently uses grep -qE '^[0-9]+\.[0-9]+\.[0-9]+' which only matches a prefix; update the pattern used when testing VERSION_CLEAN so it anchors the end as well (use ^...$) to require exactly X.Y.Z (no extra suffixes like -dev or trailing chars). Locate the block referencing VERSION_CLEAN and the grep invocation (the if echo "$VERSION_CLEAN" | grep -qE ...) and replace the regex with a fully anchored pattern that enforces three numeric dot-separated components.test/sanity/pr-tests/generate.sh (1)
88-90:⚠️ Potential issue | 🟠 MajorProvision a dbt project before emitting this PR test.
Line 90 runs
altimatein the repository as-is, so most PRs will only exercise the “no dbt project” path and never validate auto-discovery.test/sanity/phases/smoke-tests.shLines 194-221 already build the minimaldbt_project.yml+models/test_model.sqlscaffold this check needs; please mirror that setup here or call a helper that does it.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/pr-tests/generate.sh` around lines 88 - 90, The test invokes "altimate" to auto-discover a dbt project but doesn't provision one first, so update the generate.sh before calling emit_test "dbt-discover" to create the minimal dbt scaffold (create a dbt_project.yml and models/test_model.sql with a simple SELECT) or call the existing setup helper used in smoke-tests.sh (e.g., the dbt scaffold function) so the discovery path is exercised; ensure the scaffold files are created in the repo workspace visible to altimate and remove or clean them after the test if necessary.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/sanity/phases/smoke-tests.sh`:
- Around line 108-112: The heredoc building ALTIMATE_CODE_CONN_SNOWFLAKE_TEST
interpolates raw credentials into JSON which breaks if values contain quotes or
backslashes; instead construct the JSON using jq to safely encode fields (use jq
-n with --arg for SNOWFLAKE_ACCOUNT, SNOWFLAKE_USER, SNOWFLAKE_PASSWORD and
defaults for SNOWFLAKE_WAREHOUSE/SNOWFLAKE_ROLE) and assign its output to
ALTIMATE_CODE_CONN_SNOWFLAKE_TEST so the resulting value is valid, properly
escaped JSON.
---
Duplicate comments:
In `@test/sanity/phases/smoke-tests.sh`:
- Around line 213-219: The smoke test is suppressing failures by appending "||
true" to the altimate invocations, so remove the "|| true" suffix from the
altimate_run_with_turns "dbt-discover" invocation (and the "timeout 30 altimate
check ..." invocation in the dbt/check section) so their non-zero exit codes
propagate; keep the existing get_output and output-grepping logic but let
altimate_run_with_turns and timeout+altimate return failures (so failures are
written as FAIL instead of masked as PASS).
- Around line 141-147: The current MongoDB smoke test calls altimate_run
("altimate_run" against "${mongo_host}:${mongo_port}") but ignores its exit
status, so failures without matching output still pass; modify the test to
capture altimate_run's exit code immediately (e.g., run altimate_run then save
"$?" into a variable) and treat any non-zero exit as a FAIL by writing "FAIL" to
"$RESULTS_DIR/mongodb"; only proceed to inspect get_output("mongodb") for the
textual failure patterns when the exit code is zero.
In `@test/sanity/phases/verify-install.sh`:
- Around line 57-66: The help-output capture currently swallows failures by
using "|| echo \"\"", so a non-zero exit from "altimate --help" or "altimate run
--help" can still be treated as a pass; change the logic in the
HELP_OUTPUT/BRANDING_LEAKS checks to first run the help command and check its
exit status (capture output into HELP_OUTPUT, test $? or use an if command;
e.g., run "altimate --help 2>&1" and if it exits non-zero increment FAIL_COUNT
and print a failure about the help command), and only when the help command
succeeds run the grep/BRANDING_LEAKS logic (keep using BRANDING_LEAKS to search
the captured HELP_OUTPUT). Ensure the same fix is applied to both the
HELP_OUTPUT block and the analogous "altimate run --help" block.
- Around line 44-52: The semver check currently uses grep -qE
'^[0-9]+\.[0-9]+\.[0-9]+' which only matches a prefix; update the pattern used
when testing VERSION_CLEAN so it anchors the end as well (use ^...$) to require
exactly X.Y.Z (no extra suffixes like -dev or trailing chars). Locate the block
referencing VERSION_CLEAN and the grep invocation (the if echo "$VERSION_CLEAN"
| grep -qE ...) and replace the regex with a fully anchored pattern that
enforces three numeric dot-separated components.
In `@test/sanity/pr-tests/generate.sh`:
- Around line 88-90: The test invokes "altimate" to auto-discover a dbt project
but doesn't provision one first, so update the generate.sh before calling
emit_test "dbt-discover" to create the minimal dbt scaffold (create a
dbt_project.yml and models/test_model.sql with a simple SELECT) or call the
existing setup helper used in smoke-tests.sh (e.g., the dbt scaffold function)
so the discovery path is exercised; ensure the scaffold files are created in the
repo workspace visible to altimate and remove or clean them after the test if
necessary.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 710b29fa-2a6f-420b-92ce-cc3dfc3aa70e
📒 Files selected for processing (7)
test/sanity/Dockerfiletest/sanity/ci-local.shtest/sanity/docker-compose.ymltest/sanity/phases/resilience.shtest/sanity/phases/smoke-tests.shtest/sanity/phases/verify-install.shtest/sanity/pr-tests/generate.sh
🚧 Files skipped from review as they are similar to previous changes (2)
- test/sanity/Dockerfile
- test/sanity/phases/resilience.sh
| if [ -n "${SNOWFLAKE_ACCOUNT:-}" ] && [ -n "${SNOWFLAKE_USER:-}" ] && [ -n "${SNOWFLAKE_PASSWORD:-}" ]; then | ||
| export ALTIMATE_CODE_CONN_SNOWFLAKE_TEST=$(cat <<SFEOF | ||
| {"type":"snowflake","account":"${SNOWFLAKE_ACCOUNT}","user":"${SNOWFLAKE_USER}","password":"${SNOWFLAKE_PASSWORD}","warehouse":"${SNOWFLAKE_WAREHOUSE:-COMPUTE_WH}","role":"${SNOWFLAKE_ROLE:-PUBLIC}"} | ||
| SFEOF | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n test/sanity/phases/smoke-tests.sh | sed -n '100,120p'Repository: AltimateAI/altimate-code
Length of output: 1091
🏁 Script executed:
# Check if Node.js is available in the test environment
which node || echo "Node not found"Repository: AltimateAI/altimate-code
Length of output: 143
🏁 Script executed:
# Look for similar JSON encoding patterns or utilities in the test infrastructure
fd -e sh -e bash test/sanity | head -20Repository: AltimateAI/altimate-code
Length of output: 417
🏁 Script executed:
# Search for JSON encoding or jq usage in test files
rg -l "jq|JSON|json" test/sanity --type sh | head -10Repository: AltimateAI/altimate-code
Length of output: 254
🏁 Script executed:
# Check if jq is available
which jq 2>/dev/null && echo "jq found" || echo "jq not found"Repository: AltimateAI/altimate-code
Length of output: 80
🏁 Script executed:
# Look at how ALTIMATE_CODE_CONN_SNOWFLAKE_TEST is used downstream
rg "ALTIMATE_CODE_CONN_SNOWFLAKE_TEST|SNOWFLAKE_ACCOUNT|SNOWFLAKE_USER" test/sanity --type sh -A 2 -B 2Repository: AltimateAI/altimate-code
Length of output: 1068
🏁 Script executed:
# Check if there are other JSON encoding patterns in the test infrastructure
rg "cat <<" test/sanity --type sh -A 3 | head -50Repository: AltimateAI/altimate-code
Length of output: 478
🏁 Script executed:
# Look at how MongoDB or other sensitive configs are handled in similar tests
rg "export.*\$(cat" test/sanity --type sh -B 2 -A 3Repository: AltimateAI/altimate-code
Length of output: 141
🏁 Script executed:
# Fix regex - check for cat <<EOF patterns
rg 'export.*\$\(cat' test/sanity --type sh -B 2 -A 3Repository: AltimateAI/altimate-code
Length of output: 1066
🏁 Script executed:
# Check how Snowflake credentials are typically formatted - look at other parts of smoke-tests
cat -n test/sanity/phases/smoke-tests.sh | sed -n '1,50p'Repository: AltimateAI/altimate-code
Length of output: 2140
🏁 Script executed:
# Check if there are any credential handling utilities in the test library
cat test/sanity/lib/altimate-run.sh | head -30Repository: AltimateAI/altimate-code
Length of output: 809
🏁 Script executed:
# Verify how the JSON string is actually passed - check if there's any escaping or quoting
grep -n "ALTIMATE_CODE_CONN" test/sanity/phases/smoke-tests.shRepository: AltimateAI/altimate-code
Length of output: 269
🏁 Script executed:
# Check for Python in the environment as an alternative to Node
which python3 2>/dev/null && echo "python3 found" || echo "python3 not found"Repository: AltimateAI/altimate-code
Length of output: 85
🏁 Script executed:
# Check what tools are available in the test environment
ls /usr/bin | grep -E "bash|sh|sed|awk|grep|cat|jq|node|python" | head -20Repository: AltimateAI/altimate-code
Length of output: 225
🏁 Script executed:
# Look for any credential encoding utilities in the codebase
rg "escape|encode|quote" test/sanity --type shRepository: AltimateAI/altimate-code
Length of output: 2538
🏁 Script executed:
# Check if there's bash parameter expansion for escaping
rg "\\\\|\\\${" test/sanity --type sh | head -10Repository: AltimateAI/altimate-code
Length of output: 143
🏁 Script executed:
# Look at the full context of how ALTIMATE_CODE_CONN_SNOWFLAKE_TEST is used
cat -n test/sanity/phases/smoke-tests.sh | sed -n '103,125p'Repository: AltimateAI/altimate-code
Length of output: 1253
🏁 Script executed:
# Check if sed/awk are used for JSON-like transformations elsewhere
rg "sed.*JSON|awk.*JSON|JSON.*sed|JSON.*awk" --type shRepository: AltimateAI/altimate-code
Length of output: 50
Use jq to properly encode Snowflake credentials as JSON.
Lines 109–110 interpolate raw credential values directly into a JSON string. Any credential containing " or \ produces invalid JSON and causes this smoke test to fail before reaching Snowflake.
Suggested fix
- export ALTIMATE_CODE_CONN_SNOWFLAKE_TEST=$(cat <<SFEOF
-{"type":"snowflake","account":"${SNOWFLAKE_ACCOUNT}","user":"${SNOWFLAKE_USER}","password":"${SNOWFLAKE_PASSWORD}","warehouse":"${SNOWFLAKE_WAREHOUSE:-COMPUTE_WH}","role":"${SNOWFLAKE_ROLE:-PUBLIC}"}
-SFEOF
- )
+ export ALTIMATE_CODE_CONN_SNOWFLAKE_TEST=$(jq -n \
+ --arg account "$SNOWFLAKE_ACCOUNT" \
+ --arg user "$SNOWFLAKE_USER" \
+ --arg password "$SNOWFLAKE_PASSWORD" \
+ --arg warehouse "${SNOWFLAKE_WAREHOUSE:-COMPUTE_WH}" \
+ --arg role "${SNOWFLAKE_ROLE:-PUBLIC}" \
+ '{type: "snowflake", account: $account, user: $user, password: $password, warehouse: $warehouse, role: $role}' \
+ )📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if [ -n "${SNOWFLAKE_ACCOUNT:-}" ] && [ -n "${SNOWFLAKE_USER:-}" ] && [ -n "${SNOWFLAKE_PASSWORD:-}" ]; then | |
| export ALTIMATE_CODE_CONN_SNOWFLAKE_TEST=$(cat <<SFEOF | |
| {"type":"snowflake","account":"${SNOWFLAKE_ACCOUNT}","user":"${SNOWFLAKE_USER}","password":"${SNOWFLAKE_PASSWORD}","warehouse":"${SNOWFLAKE_WAREHOUSE:-COMPUTE_WH}","role":"${SNOWFLAKE_ROLE:-PUBLIC}"} | |
| SFEOF | |
| ) | |
| if [ -n "${SNOWFLAKE_ACCOUNT:-}" ] && [ -n "${SNOWFLAKE_USER:-}" ] && [ -n "${SNOWFLAKE_PASSWORD:-}" ]; then | |
| export ALTIMATE_CODE_CONN_SNOWFLAKE_TEST=$(jq -n \ | |
| --arg account "$SNOWFLAKE_ACCOUNT" \ | |
| --arg user "$SNOWFLAKE_USER" \ | |
| --arg password "$SNOWFLAKE_PASSWORD" \ | |
| --arg warehouse "${SNOWFLAKE_WAREHOUSE:-COMPUTE_WH}" \ | |
| --arg role "${SNOWFLAKE_ROLE:-PUBLIC}" \ | |
| '{type: "snowflake", account: $account, user: $user, password: $password, warehouse: $warehouse, role: $role}' \ | |
| ) |
🧰 Tools
🪛 Shellcheck (0.11.0)
[warning] 109-109: Declare and assign separately to avoid masking return values.
(SC2155)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/sanity/phases/smoke-tests.sh` around lines 108 - 112, The heredoc
building ALTIMATE_CODE_CONN_SNOWFLAKE_TEST interpolates raw credentials into
JSON which breaks if values contain quotes or backslashes; instead construct the
JSON using jq to safely encode fields (use jq -n with --arg for
SNOWFLAKE_ACCOUNT, SNOWFLAKE_USER, SNOWFLAKE_PASSWORD and defaults for
SNOWFLAKE_WAREHOUSE/SNOWFLAKE_ROLE) and assign its output to
ALTIMATE_CODE_CONN_SNOWFLAKE_TEST so the resulting value is valid, properly
escaped JSON.
a6dcd1c to
9c89f85
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test/sanity/phases/resilience.sh (1)
95-121:⚠️ Potential issue | 🟠 MajorIsolate these fixture configs from the caller’s real XDG config.
This block writes to
${XDG_CONFIG_HOME:-$HOME/.config}/altimate-code/opencode.jsonand then removes it. Running the sanity phase locally can clobber a real user config and leak fixture state across runs. Please use a tempXDG_CONFIG_HOMEfor these scenarios, or at least backup/restore any existing file.Suggested change
-CONFIG_DIR="${XDG_CONFIG_HOME:-$HOME/.config}/altimate-code" +TEST_XDG_CONFIG_HOME=$(mktemp -d /tmp/sanity-config-XXXXXX) +CONFIG_DIR="$TEST_XDG_CONFIG_HOME/altimate-code" mkdir -p "$CONFIG_DIR" if [ -f "$SCRIPT_DIR/fixtures/old-config.json" ]; then cp "$SCRIPT_DIR/fixtures/old-config.json" "$CONFIG_DIR/opencode.json" if [ -n "${ANTHROPIC_API_KEY:-}" ]; then - altimate_run "old-config" "say hello" || true + XDG_CONFIG_HOME="$TEST_XDG_CONFIG_HOME" altimate_run "old-config" "say hello" || true assert_not_contains "$(get_output old-config)" "parse error" "old config loads without parse error" else skip_test "Config backwards compat" "no ANTHROPIC_API_KEY" fi rm -f "$CONFIG_DIR/opencode.json" fi ... cp "$SCRIPT_DIR/fixtures/broken-config.json" "$CONFIG_DIR/opencode.json" if [ -n "${ANTHROPIC_API_KEY:-}" ]; then - altimate_run "broken-config" "say hello" || true + XDG_CONFIG_HOME="$TEST_XDG_CONFIG_HOME" altimate_run "broken-config" "say hello" || true assert_not_contains "$(get_output broken-config)" "stack trace" "broken config handled gracefully" else - OUTPUT=$(timeout 10 altimate run --max-turns 1 --yolo "hello" 2>&1 || true) + OUTPUT=$(XDG_CONFIG_HOME="$TEST_XDG_CONFIG_HOME" timeout 10 altimate run --max-turns 1 --yolo "hello" 2>&1 || true) assert_not_contains "$OUTPUT" "SyntaxError" "broken config no SyntaxError" fi rm -f "$CONFIG_DIR/opencode.json" fi +rm -rf "$TEST_XDG_CONFIG_HOME"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/resilience.sh` around lines 95 - 121, This code writes fixture configs directly into CONFIG_DIR (derived from XDG_CONFIG_HOME) which can clobber a real user config; fix by creating and exporting a temporary XDG_CONFIG_HOME for the test block (e.g. TMP_XDG=$(mktemp -d) && export XDG_CONFIG_HOME="$TMP_XDG") before computing CONFIG_DIR so CONFIG_DIR points into the temp dir, ensure you set a trap to cleanup the temp dir and restore any original XDG_CONFIG_HOME after the block, and then run the existing cp/altimate_run/rm steps against that CONFIG_DIR; alternatively, if you prefer to preserve an existing file, detect and back up "$CONFIG_DIR/opencode.json" before copying and restore it in the cleanup/trap — reference CONFIG_DIR, XDG_CONFIG_HOME, and the cp/rm usage in this file to locate where to apply the change.
♻️ Duplicate comments (5)
test/sanity/phases/resilience.sh (2)
180-194:⚠️ Potential issue | 🟡 MinorSkip the no-internet case when no provider key is configured.
If
ANTHROPIC_API_KEYwas empty on entry, this step just repeats the missing-key path from Lines 169-178 and never validates network-failure handling. Guard it with the same skip pattern used elsewhere.Suggested change
echo " [10/10] No internet graceful handling..." -# Use an unreachable DNS to simulate no internet (timeout quickly) -NO_NET_OUTPUT=$(timeout 15 env https_proxy=http://192.0.2.1:1 http_proxy=http://192.0.2.1:1 \ - altimate run --max-turns 1 --yolo "hello" 2>&1 || true) -assert_not_contains "$NO_NET_OUTPUT" "TypeError" "no TypeError without internet" -assert_not_contains "$NO_NET_OUTPUT" "Cannot read properties" "no unhandled error without internet" -# Should get some kind of connection/auth error, not a blank hang -if [ -z "$NO_NET_OUTPUT" ]; then - echo " FAIL: no output at all without internet (blank screen)" - FAIL_COUNT=$((FAIL_COUNT + 1)) +if [ -z "${ANTHROPIC_API_KEY:-}" ]; then + skip_test "No internet graceful handling" "no ANTHROPIC_API_KEY" else - echo " PASS: produced output without internet ($(echo "$NO_NET_OUTPUT" | wc -l) lines)" - PASS_COUNT=$((PASS_COUNT + 1)) + # Use an unreachable DNS to simulate no internet (timeout quickly) + NO_NET_OUTPUT=$(timeout 15 env https_proxy=http://192.0.2.1:1 http_proxy=http://192.0.2.1:1 \ + altimate run --max-turns 1 --yolo "hello" 2>&1 || true) + assert_not_contains "$NO_NET_OUTPUT" "TypeError" "no TypeError without internet" + assert_not_contains "$NO_NET_OUTPUT" "Cannot read properties" "no unhandled error without internet" + if [ -z "$NO_NET_OUTPUT" ]; then + echo " FAIL: no output at all without internet (blank screen)" + FAIL_COUNT=$((FAIL_COUNT + 1)) + else + echo " PASS: produced output without internet ($(echo "$NO_NET_OUTPUT" | wc -l) lines)" + PASS_COUNT=$((PASS_COUNT + 1)) + fi fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/resilience.sh` around lines 180 - 194, The no-internet test block runs even when ANTHROPIC_API_KEY is unset, duplicating the missing-key flow instead of exercising network-failure behavior; wrap the whole block that sets NO_NET_OUTPUT and calls assert_not_contains/echo PASS/FAIL in the same guard used earlier (check [ -z "$ANTHROPIC_API_KEY" ]), and if empty skip the test (increment SKIP_COUNT or print a skip message), otherwise run the existing NO_NET_OUTPUT logic so the test only validates network failure when a provider key is configured.
147-162:⚠️ Potential issue | 🟠 MajorRequire the permission-layer deny marker before counting this as PASS.
packages/opencode/src/cli/cmd/run.tsproves deny enforcement withyolo mode: BLOCKED by deny rule:. The current logic still passes on genericdenied/blocked/not allowedtext, and the finalelsepasses on any other non-empty output, so model self-refusals can still false-pass this test.Suggested change
- elif echo "$DENY_OUTPUT" | grep -qi "denied\|blocked\|BLOCKED by deny rule\|not allowed"; then + elif echo "$DENY_OUTPUT" | grep -Eqi "yolo mode: BLOCKED by deny rule:|permission.*deny"; then echo " PASS: yolo deny rule explicitly blocked command" PASS_COUNT=$((PASS_COUNT + 1)) elif [ -z "$DENY_OUTPUT" ]; then echo " FAIL: no output from deny enforcement test" FAIL_COUNT=$((FAIL_COUNT + 1)) else - # Model may have refused on its own — marker absent so still safe - echo " PASS: yolo deny rule (command not executed, marker absent)" - PASS_COUNT=$((PASS_COUNT + 1)) + echo " FAIL: command was not executed, but deny enforcement was not observed" + FAIL_COUNT=$((FAIL_COUNT + 1)) fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/resilience.sh` around lines 147 - 162, The test currently treats any generic refusal text as a PASS; require the permission-layer deny indicator instead: change the grep that checks DENY_OUTPUT to look only for the exact permission-layer phrase (case-insensitive) "BLOCKED by deny rule" (the same string used in packages/opencode/src/cli/cmd/run.ts), and only increment PASS_COUNT when that exact phrase is found; for any other non-empty DENY_OUTPUT (model self-refusal) convert the current final else branch to a FAIL (increment FAIL_COUNT) so only an explicit permission-layer deny yields PASS; update references to DENY_OUTPUT and DENY_MARKER accordingly.test/sanity/phases/smoke-tests.sh (3)
227-231:⚠️ Potential issue | 🟠 MajorPreserve the
altimate checkexit code.
timeout 30 altimate check check_target.sql 2>&1 || trueturns timeouts and other non-zero exits into a PASS unless the stderr text happens to match the grep. This smoke test should fail on any non-zero status.Suggested fix
- local output=$(timeout 30 altimate check check_target.sql 2>&1 || true) - if echo "$output" | grep -qi "TypeError\|unhandled\|Cannot read properties"; then + local output status + output=$(timeout 30 altimate check check_target.sql 2>&1) + status=$? + if [ $status -ne 0 ] || echo "$output" | grep -qi "TypeError\|unhandled\|Cannot read properties"; then echo "FAIL" > "$RESULTS_DIR/check-cmd" else echo "PASS" > "$RESULTS_DIR/check-cmd" fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` around lines 227 - 231, The current command masks the exit code by using "|| true" when running "timeout 30 altimate check check_target.sql", so non-zero exits (including timeouts) are treated as PASS; remove the "|| true" and preserve the command's exit status by capturing both output and its exit code: run "timeout 30 altimate check check_target.sql" while redirecting stdout/stderr into the "output" variable (or a temp file), capture "$?" immediately into a variable (e.g., "rc"), then if rc is non-zero write "FAIL" to "$RESULTS_DIR/check-cmd"; otherwise perform the existing grep on the captured "output" to decide PASS/FAIL, and reference the "altimate check" invocation, the "output" variable, and "$RESULTS_DIR/check-cmd" when making this change.
213-218:⚠️ Potential issue | 🟠 MajorDon't record PASS when
dbt-discoverexits non-zero.In
test/sanity/lib/altimate-run.sh, Lines 27-42,altimate_run_with_turnsalready returns a failure status and writesTIMEOUTon exit 124, but|| truediscards that signal here. Any timeout or other CLI failure that doesn't include those three substrings currently lands on PASS.Suggested fix
- SANITY_TIMEOUT=90 altimate_run_with_turns "dbt-discover" 2 "discover dbt project config in this repo" || true - local output=$(get_output "dbt-discover") - if echo "$output" | grep -qi "unhandled\|TypeError\|Cannot read"; then + local status output + SANITY_TIMEOUT=90 altimate_run_with_turns "dbt-discover" 2 "discover dbt project config in this repo" + status=$? + output=$(get_output "dbt-discover") + if [ $status -ne 0 ] || echo "$output" | grep -qi "TIMEOUT\|unhandled\|TypeError\|Cannot read"; then echo "FAIL" > "$RESULTS_DIR/dbt-discover" else echo "PASS" > "$RESULTS_DIR/dbt-discover" fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` around lines 213 - 218, The test currently ignores the exit status of altimate_run_with_turns by using "|| true", causing non-zero exits (timeouts/failures) to be marked PASS if output lacks the three error substrings; modify the dbt-discover invocation to capture and check the command's exit code (call to altimate_run_with_turns) and only write "PASS" to RESULTS_DIR/dbt-discover when the exit code is 0 and the output does not match error patterns; otherwise write "FAIL" (and preserve TIMEOUT semantics from altimate-run.sh if exit 124). Reference symbols: altimate_run_with_turns, get_output, dbt-discover, RESULTS_DIR.
109-112:⚠️ Potential issue | 🟠 MajorBuild the Snowflake test config with JSON-safe escaping.
Interpolating raw credential values into this heredoc breaks the JSON as soon as any field contains
"or\, so the smoke test can fail before it ever reaches Snowflake. Please constructALTIMATE_CODE_CONN_SNOWFLAKE_TESTwith a JSON-aware encoder instead of string interpolation.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` around lines 109 - 112, The heredoc building ALTIMATE_CODE_CONN_SNOWFLAKE_TEST injects raw credential values which breaks JSON when fields contain quotes or backslashes; replace the multiline string interpolation with a JSON-aware encoder (e.g., use jq, python -c "import json,sys; print(json.dumps({...}))", or node/printf JSON tooling) to construct the object and assign it to ALTIMATE_CODE_CONN_SNOWFLAKE_TEST so each of SNOWFLAKE_ACCOUNT, SNOWFLAKE_USER, SNOWFLAKE_PASSWORD, SNOWFLAKE_WAREHOUSE, and SNOWFLAKE_ROLE are passed as individual arguments and properly escaped; keep the same JSON keys ("type":"snowflake", "account", "user", "password", "warehouse", "role") and ensure the result is a single-line JSON string suitable for export.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@test/sanity/phases/resilience.sh`:
- Around line 95-121: This code writes fixture configs directly into CONFIG_DIR
(derived from XDG_CONFIG_HOME) which can clobber a real user config; fix by
creating and exporting a temporary XDG_CONFIG_HOME for the test block (e.g.
TMP_XDG=$(mktemp -d) && export XDG_CONFIG_HOME="$TMP_XDG") before computing
CONFIG_DIR so CONFIG_DIR points into the temp dir, ensure you set a trap to
cleanup the temp dir and restore any original XDG_CONFIG_HOME after the block,
and then run the existing cp/altimate_run/rm steps against that CONFIG_DIR;
alternatively, if you prefer to preserve an existing file, detect and back up
"$CONFIG_DIR/opencode.json" before copying and restore it in the cleanup/trap —
reference CONFIG_DIR, XDG_CONFIG_HOME, and the cp/rm usage in this file to
locate where to apply the change.
---
Duplicate comments:
In `@test/sanity/phases/resilience.sh`:
- Around line 180-194: The no-internet test block runs even when
ANTHROPIC_API_KEY is unset, duplicating the missing-key flow instead of
exercising network-failure behavior; wrap the whole block that sets
NO_NET_OUTPUT and calls assert_not_contains/echo PASS/FAIL in the same guard
used earlier (check [ -z "$ANTHROPIC_API_KEY" ]), and if empty skip the test
(increment SKIP_COUNT or print a skip message), otherwise run the existing
NO_NET_OUTPUT logic so the test only validates network failure when a provider
key is configured.
- Around line 147-162: The test currently treats any generic refusal text as a
PASS; require the permission-layer deny indicator instead: change the grep that
checks DENY_OUTPUT to look only for the exact permission-layer phrase
(case-insensitive) "BLOCKED by deny rule" (the same string used in
packages/opencode/src/cli/cmd/run.ts), and only increment PASS_COUNT when that
exact phrase is found; for any other non-empty DENY_OUTPUT (model self-refusal)
convert the current final else branch to a FAIL (increment FAIL_COUNT) so only
an explicit permission-layer deny yields PASS; update references to DENY_OUTPUT
and DENY_MARKER accordingly.
In `@test/sanity/phases/smoke-tests.sh`:
- Around line 227-231: The current command masks the exit code by using "||
true" when running "timeout 30 altimate check check_target.sql", so non-zero
exits (including timeouts) are treated as PASS; remove the "|| true" and
preserve the command's exit status by capturing both output and its exit code:
run "timeout 30 altimate check check_target.sql" while redirecting stdout/stderr
into the "output" variable (or a temp file), capture "$?" immediately into a
variable (e.g., "rc"), then if rc is non-zero write "FAIL" to
"$RESULTS_DIR/check-cmd"; otherwise perform the existing grep on the captured
"output" to decide PASS/FAIL, and reference the "altimate check" invocation, the
"output" variable, and "$RESULTS_DIR/check-cmd" when making this change.
- Around line 213-218: The test currently ignores the exit status of
altimate_run_with_turns by using "|| true", causing non-zero exits
(timeouts/failures) to be marked PASS if output lacks the three error
substrings; modify the dbt-discover invocation to capture and check the
command's exit code (call to altimate_run_with_turns) and only write "PASS" to
RESULTS_DIR/dbt-discover when the exit code is 0 and the output does not match
error patterns; otherwise write "FAIL" (and preserve TIMEOUT semantics from
altimate-run.sh if exit 124). Reference symbols: altimate_run_with_turns,
get_output, dbt-discover, RESULTS_DIR.
- Around line 109-112: The heredoc building ALTIMATE_CODE_CONN_SNOWFLAKE_TEST
injects raw credential values which breaks JSON when fields contain quotes or
backslashes; replace the multiline string interpolation with a JSON-aware
encoder (e.g., use jq, python -c "import json,sys; print(json.dumps({...}))", or
node/printf JSON tooling) to construct the object and assign it to
ALTIMATE_CODE_CONN_SNOWFLAKE_TEST so each of SNOWFLAKE_ACCOUNT, SNOWFLAKE_USER,
SNOWFLAKE_PASSWORD, SNOWFLAKE_WAREHOUSE, and SNOWFLAKE_ROLE are passed as
individual arguments and properly escaped; keep the same JSON keys
("type":"snowflake", "account", "user", "password", "warehouse", "role") and
ensure the result is a single-line JSON string suitable for export.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: c2d602cf-54e7-4f9d-be54-cb6fba35dabd
📒 Files selected for processing (7)
test/sanity/Dockerfiletest/sanity/ci-local.shtest/sanity/docker-compose.ymltest/sanity/phases/resilience.shtest/sanity/phases/smoke-tests.shtest/sanity/phases/verify-install.shtest/sanity/pr-tests/generate.sh
✅ Files skipped from review due to trivial changes (1)
- test/sanity/pr-tests/generate.sh
🚧 Files skipped from review as they are similar to previous changes (3)
- test/sanity/ci-local.sh
- test/sanity/Dockerfile
- test/sanity/docker-compose.yml
…d resilience tests Add 22 new sanity tests across all phases: verify-install.sh: - Semver version format check (#212) - --help branding leak detection for "opencode" strings (#416, #417) - "altimate run --help" branding leak detection - 9 driver resolvability checks: pg, snowflake-sdk, mysql2, mssql, duckdb, mongodb, bigquery, databricks, oracledb (#295) resilience.sh (reorganized from 8 to 10 tests): - Yolo deny enforcement using observable side-effect marker file, with config written to correct altimate-code/ subdir (#372, #377) - Missing API key graceful handling - No-internet graceful error via TEST-NET-1 proxy (#181) smoke-tests.sh: - dbt config discovery in isolated tmpdir (#448, #270) - altimate check command with positional file arg (#453) - MongoDB smoke test alongside existing postgres/snowflake Infrastructure: - Dockerfile: install all 9 driver packages (not just altimate-core) - docker-compose.yml: add MongoDB 7 service with healthcheck - ci-local.sh: add mongodb to Docker services, bump healthy count to 5, add mongodb E2E test step - pr-tests/generate.sh: add triggers for permission/yolo, branding, dbt, and driver file changes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
9c89f85 to
2c1aa16
Compare
Summary
altimate checkcommand (feat: addaltimate-code checkCLI command for deterministic SQL checks #453), MongoDB smoke test (feat: add MongoDB driver support #482)Changes
verify-install.shresilience.shsmoke-tests.shDockerfiledocker-compose.ymlci-local.shpr-tests/generate.shReview
9-model code review completed (Claude, GPT 5.4 Codex, Gemini 3.1 Pro, Kimi K2.5, Grok 4, MiniMax M2.7, GLM-5, Qwen 3.5 Plus, MiMo V2 Pro). All 5 critical/major findings from the review have been addressed in this squashed commit.
Test plan
bash -nsyntax check passes on all 7 modified filesdocker compose configvalidates docker-compose.yml--help🤖 Generated with Claude Code
Summary by CodeRabbit