
ci(test): committed gap-suite runner + informational smoke-parity gate #5222

Open
TheHypnoo wants to merge 5 commits into main from chore/ci-smoke-parity-gate


@TheHypnoo TheHypnoo commented Jun 15, 2026

Member

What

  • New committed scripts/run_gap_tests.sh for the 235-test gap suite (test-files/test_gap_*.ts), replacing the out-of-repo /tmp/run_gap_tests.sh that CLAUDE.md pointed at.
  • New smoke-parity CI job that runs it. Informational (continue-on-error) for now.
  • Fixes the stale CLAUDE.md parity-status line (28 → 235 tests; /tmp → scripts/run_gap_tests.sh).

Why

The gap suite — AOT-compile each test_gap_*.ts and diff byte-for-byte against node --experimental-strip-types — is the highest-signal, most deterministic test Perry has, yet it had no committed runner (CLAUDE.md pointed at /tmp) and no CI gate. The full parity job is opt-in (tag/label only) and runs node 22, so a single-feature regression can merge green.
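For readers new to the suite, the comparison described above can be sketched roughly as follows. This is an illustration only: the helper name `byte_parity` and the example paths in the usage comment are hypothetical, and the real runner also applies a normalizer and skip-list that this sketch leaves out.

```shell
# byte_parity REF_CMD TEST_CMD
# Runs both commands through `sh -c`, captures stdout and stderr of each,
# and succeeds only if the two captures are byte-for-byte identical.
# Hypothetical helper, not part of the repository.
byte_parity() {
  ref_out="$(mktemp)"
  got_out="$(mktemp)"
  sh -c "$1" > "$ref_out" 2>&1
  sh -c "$2" > "$got_out" 2>&1
  if cmp -s "$ref_out" "$got_out"; then
    status=0
  else
    status=1
  fi
  rm -f "$ref_out" "$got_out"
  return "$status"
}

# Usage (paths are placeholders, and the compiled binary is assumed to exist):
#   byte_parity 'node --experimental-strip-types test-files/test_gap_example.ts' \
#               './test_gap_example_binary'
```

Any single differing byte makes the helper return nonzero, which is what makes this kind of test deterministic and high-signal.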

Design

  • run_gap_tests.sh is a thin wrapper over run_parity_tests.sh --filter test_gap_, so it reuses the one canonical normalizer + skip-list + per-test output cap (seed of the single-normalizer cleanup) instead of inventing a third differ.
  • Gate semantics: no new failures vs test-parity/known_failures.json. (run_parity_tests.sh's own exit code only trips below 80% aggregate parity — far too loose to catch a single-feature regression.)
  • node 26 = the node-suite baseline oracle (the upcoming regression guard will use the same), vs the legacy parity job's node 22.
  • Uses only allow-listed actions (actions/checkout, dtolnay/rust-toolchain, Swatinem/rust-cache, actions/setup-node) — no sccache — so it is not blocked by the org action allow-list.
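To make the gate semantics concrete: if the 80% aggregate threshold applies to the 235-test gap subset, as many as 47 tests could fail before that exit code trips. A "no new failures" gate can be sketched as below. This is a sketch under assumptions, not the committed script: it assumes the failing and known-failing test names have already been extracted into plain text files (one name per line), and the function names are hypothetical.

```shell
# new_failures CURRENT_FAILS KNOWN_FAILS
# Prints names listed in CURRENT_FAILS that are absent from KNOWN_FAILS.
# Both inputs are assumed to be plain text, one test name per line.
new_failures() {
  known_sorted="$(mktemp)"
  sort -u "$2" > "$known_sorted"
  # comm -23 keeps lines that appear only in the first input (stdin here).
  sort -u "$1" | comm -23 - "$known_sorted"
  rm -f "$known_sorted"
}

# gap_gate CURRENT_FAILS KNOWN_FAILS
# Returns 1 and lists the untriaged failures, or prints "Gap gate OK".
gap_gate() {
  fresh="$(new_failures "$1" "$2")"
  if [ -n "$fresh" ]; then
    echo "New untriaged gap failures:"
    printf '%s\n' "$fresh"
    return 1
  fi
  echo "Gap gate OK"
}
```

With this shape a single newly failing test is enough to fail the gate, while failures already triaged into the known list stay quiet.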

Rollout (intentionally staged)

  1. This PR: the job is continue-on-error (informational). The first runs tell us which gap tests currently fail on the Linux image under node 26.
  2. Triage those into known_failures.json (or fix them).
  3. Follow-up: drop continue-on-error and add smoke-parity to the branch-protection required checks.

Part of a tiered-CI cost-reduction effort. No source changes; no existing job altered.

Summary by CodeRabbit

Release Notes

  • Tests

    • Expanded the TypeScript “gap” suite from 28 to 235 tests for wider coverage.
    • Added a CI smoke-parity run that executes the gap-suite and gates on newly detected, untriaged parity/compile gaps.
    • Introduced a dedicated gap-suite runner script that reports and surfaces only new untriaged failures.
  • Documentation

    • Updated the TypeScript Parity Status guidance to reflect the new suite size, the recommended run command, and the updated gating behavior.

Add scripts/run_gap_tests.sh — a committed runner for the gap suite
(test-files/test_gap_*.ts, 235 tests), replacing the out-of-repo
/tmp/run_gap_tests.sh that CLAUDE.md pointed at. It is a thin wrapper over
run_parity_tests.sh --filter test_gap_, so it reuses the one canonical
normalizer, skip-list, and output cap (seed of the single-normalizer work),
and gates on "no NEW failures vs known_failures.json" rather than
run_parity_tests.sh's loose <80%-aggregate exit.

Wire it into a new smoke-parity CI job. The gap suite is the highest-signal-
per-second test Perry has and had no committed runner and no PR gate — a
single-feature regression could merge green. The job is INFORMATIONAL for now
(continue-on-error): the first runs surface which gap tests fail on the Linux
image under node 26 so they can be triaged into known_failures.json; once
curated + green, a follow-up drops continue-on-error and branch protection
makes it required.

Uses node 26 (the node-suite baseline oracle) and only allow-listed actions
(no sccache), so it is not blocked by the org action allow-list.

Also fix the stale CLAUDE.md parity-status line (28 -> 235 tests, /tmp ->
scripts/run_gap_tests.sh).

No source changes; no existing job altered.

coderabbitai Bot commented Jun 15, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f43f1922-8aa3-4f9b-bb7f-b155ff8f5fa3

📥 Commits

Reviewing files that changed from the base of the PR and between 5a309ab and 101ce8d.

📒 Files selected for processing (1)
  • scripts/run_gap_tests.sh

📝 Walkthrough


Adds scripts/run_gap_tests.sh, a Bash script that runs the test_gap_* parity subset via run_parity_tests.sh, extracts failures from the JSON report, diffs them against a known-failures list, and exits nonzero on untriaged failures. A new smoke-parity CI job calls this script with continue-on-error: true. CLAUDE.md is updated to reference the new command and the expanded 235-test count.

Changes

Gap test smoke gate

| Layer / File(s) | Summary |
| --- | --- |
| Gap test runner and triage gate: scripts/run_gap_tests.sh | New script invokes run_parity_tests.sh --filter test_gap_, reads test-parity/reports/latest.json to collect parity and compile failures, diffs them against test-parity/known_failures.json, and exits 1 with a printed list when untriaged failures are found; exits 0 with "Gap gate OK" otherwise. |
| CI job wiring and docs update: .github/workflows/test.yml, CLAUDE.md | Adds the smoke-parity GitHub Actions job (continue-on-error: true, 60-min timeout, Node.js 26) that runs ./scripts/run_gap_tests.sh; updates CLAUDE.md to reflect 235 tracked tests and the committed script as the run command. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • PerryTS/perry#5116: Adds the preceding gc-stress job in the GitHub Actions workflow; the new smoke-parity job is positioned directly before it in the test pipeline.

Poem

🐇 Hop, hop! The gap tests now run in CI lane,
A Bash script sniffs failures, untriaged ones get named.
continue-on-error keeps the pipeline sane,
Known failures are skipped — no false alarms remain.
The rabbit checks the gate: "Gap gate OK!" it exclaims. ✅

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main changes: adding a committed gap-suite runner script and a new informational smoke-parity CI job. |
| Description check | ✅ Passed | The description comprehensively covers the what, why, and design decisions, with detailed context on rollout strategy and gate semantics. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/test.yml:
- Around line 284-295: Replace all mutable version tags with full commit SHAs
for the four GitHub Actions used in the smoke-parity job in the test.yml
workflow file. Update actions/checkout from `@v6` to its commit SHA,
dtolnay/rust-toolchain from `@stable` to its commit SHA, Swatinem/rust-cache from
`@v2` to its commit SHA, and actions/setup-node from `@v6` to its commit SHA. Each
action should be specified in the format owner/repo@<full-commit-sha> to ensure
reproducibility and prevent supply-chain drift.

In `@scripts/run_gap_tests.sh`:
- Around line 50-60: The script uses fixed temporary filenames like
/tmp/gap_all_fails.txt, /tmp/gap_known.txt, and /tmp/gap_new.txt which can
conflict when multiple script instances run concurrently. Replace all three
fixed temp filenames with dynamically generated unique names using mktemp or by
incorporating a process-specific identifier (such as $$) into each filename.
Update the assignments where these temp files are created from $REPORT, $KNOWN,
and the comm command output, and ensure the wc -l command references the correct
updated variable name for /tmp/gap_all_fails.txt.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 301df9eb-f909-4158-8e87-a5913c9fbe0e

📥 Commits

Reviewing files that changed from the base of the PR and between 4d405b4 and 4262e2e.

📒 Files selected for processing (3)
  • .github/workflows/test.yml
  • CLAUDE.md
  • scripts/run_gap_tests.sh

Comment on lines +284 to +295
      - uses: actions/checkout@v6

      - name: Install Rust toolchain
        uses: dtolnay/rust-toolchain@stable

      - uses: Swatinem/rust-cache@v2
        with:
          shared-key: "${{ runner.os }}-perry"
          save-if: ${{ github.ref == 'refs/heads/main' }}

      - name: Setup Node.js
        uses: actions/setup-node@v6


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify unpinned action refs in workflow files.
# Expected after fix: no matches.
rg -nP '^\s*-\s*uses:\s*[^@\s]+@(?![0-9a-fA-F]{40}\b).+$' .github/workflows/*.yml

Repository: PerryTS/perry

Length of output: 4338


🏁 Script executed:

# Extract the job context around lines 279-300 to verify if this is a new smoke-parity job
sed -n '279,300p' .github/workflows/test.yml | cat -n

# Also check if all actions in the snippet (284-295) are unpinned
echo "--- Checking lines 284-295 for unpinned actions ---"
sed -n '284,295p' .github/workflows/test.yml | cat -n

Repository: PerryTS/perry

Length of output: 1241


Pin all GitHub Actions in the smoke-parity job to commit SHAs.

Four actions use mutable version tags instead of commit SHAs:

  • actions/checkout@v6 (line 284)
  • dtolnay/rust-toolchain@stable (line 287)
  • Swatinem/rust-cache@v2 (line 289)
  • actions/setup-node@v6 (line 295)

Pinning to full commit SHAs is required to prevent supply-chain drift and ensure policy compliance.
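Where ripgrep or PCRE lookahead is unavailable, a similar check can be approximated with plain grep. The following is a minimal sketch, not the tooling used in this review; the function name `list_unpinned_uses` is hypothetical. Unlike a pattern anchored on `- uses:`, it also catches `uses:` lines that sit under a `- name:` entry.

```shell
# list_unpinned_uses WORKFLOW_FILE
# Prints (with line numbers) every "uses: owner/repo@ref" line whose ref is
# not a 40-character hexadecimal commit SHA. Prints nothing if all are pinned.
list_unpinned_uses() {
  grep -nE '^[[:space:]]*(-[[:space:]]*)?uses:[[:space:]]*[^@[:space:]]+@' "$1" \
    | grep -vE '@[0-9a-fA-F]{40}([[:space:]]|$)' || true
}

# Usage:
#   list_unpinned_uses .github/workflows/test.yml
```

An empty result would indicate that every action reference in the file is pinned to a full commit SHA.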

🧰 Tools
🪛 zizmor (1.25.2)

[warning] 284-284: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)


[error] 284-284: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)


[error] 287-287: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)


[error] 289-289: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)


[error] 295-295: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)


[error] 289-289: runtime artifacts potentially vulnerable to a cache poisoning attack (cache-poisoning): enables caching by default

(cache-poisoning)


[error] 295-295: runtime artifacts potentially vulnerable to a cache poisoning attack (cache-poisoning): enables caching by default

(cache-poisoning)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test.yml around lines 284 - 295, Replace all mutable
version tags with full commit SHAs for the four GitHub Actions used in the
smoke-parity job in the test.yml workflow file. Update actions/checkout from `@v6`
to its commit SHA, dtolnay/rust-toolchain from `@stable` to its commit SHA,
Swatinem/rust-cache from `@v2` to its commit SHA, and actions/setup-node from `@v6`
to its commit SHA. Each action should be specified in the format
owner/repo@<full-commit-sha> to ensure reproducibility and prevent supply-chain
drift.

Source: Linters/SAST tools

Comment thread scripts/run_gap_tests.sh Outdated

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (1)
.github/workflows/test.yml (1)

124-126: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pin mozilla-actions/sccache-action to a full commit SHA.

Line 125, Line 206, and Line 402 use a mutable tag (@v0.0.10). For CI supply-chain integrity and reproducibility, this needs owner/repo@<40-char-sha> pinning.

Suggested change
-      - name: Start sccache
-        uses: mozilla-actions/sccache-action@v0.0.10
+      - name: Start sccache
+        uses: mozilla-actions/sccache-action@<full-commit-sha>

Also applies to: 205-207, 401-403

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test.yml around lines 124 - 126, The
mozilla-actions/sccache-action action is pinned to a mutable version tag
(`@v0.0.10`) at three locations in the workflow file, which compromises
reproducibility and supply-chain security. Replace the mutable tag `@v0.0.10` with
a full 40-character commit SHA for the mozilla-actions/sccache-action action at
all three affected locations: .github/workflows/test.yml lines 125, 205-207, and
401-403. Use the same commit SHA across all three locations to ensure consistent
behavior.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In @.github/workflows/test.yml:
- Around line 124-126: The mozilla-actions/sccache-action action is pinned to a
mutable version tag (`@v0.0.10`) at three locations in the workflow file, which
compromises reproducibility and supply-chain security. Replace the mutable tag
`@v0.0.10` with a full 40-character commit SHA for the
mozilla-actions/sccache-action action at all three affected locations:
.github/workflows/test.yml lines 125, 205-207, and 401-403. Use the same commit
SHA across all three locations to ensure consistent behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 7070c0c8-861a-47aa-ba29-e1f0a32ed83b

📥 Commits

Reviewing files that changed from the base of the PR and between 4262e2e and 5a309ab.

📒 Files selected for processing (2)
  • .github/workflows/test.yml
  • CLAUDE.md
✅ Files skipped from review due to trivial changes (1)
  • CLAUDE.md

@TheHypnoo

Member Author

Good catch — fixed in the latest commit. run_gap_tests.sh now allocates a run-scoped mktemp -d dir and cleans it up via trap ... EXIT, so concurrent runs can't clobber each other's failure lists.

TheHypnoo and others added 2 commits June 16, 2026 12:27
run_gap_tests.sh wrote its failure lists to fixed /tmp/gap_*.txt names, so
concurrent runs (a second PR, local + CI on the same box, or the upcoming
node-suite-guard alongside) could clobber each other and produce a false
gate result. Allocate a run-scoped dir with mktemp -d and rm -rf it on EXIT.
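
The pattern described in this commit message looks roughly like the sketch below. It is an illustration of the mktemp -d plus trap idiom, not the committed code; the variable names are assumptions modeled on the old /tmp/gap_*.txt filenames.

```shell
# Allocate a private scratch directory for this run only.
workdir="$(mktemp -d)" || exit 1

# Remove the directory when the shell exits, whether the run passed or failed.
trap 'rm -rf "$workdir"' EXIT

# Per-run paths replace the old fixed /tmp/gap_*.txt names, so two
# concurrent runs never read or overwrite each other's lists.
all_fails="$workdir/gap_all_fails.txt"
known="$workdir/gap_known.txt"
new="$workdir/gap_new.txt"
```

Because mktemp -d creates a uniquely named directory on every invocation, a local run and a CI run on the same machine each get their own set of files.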