
fix(integration): clear residual greptile findings on the L3 gate #817

Merged: Yiminnn merged 1 commit into main from fix/integration-residual-greptile on Jun 20, 2026

@Yiminnn (Collaborator) commented Jun 20, 2026

Cross-checked every greptile comment left unaddressed on the earlier integration PRs (#802-#810). Most were already fixed by later work or stale; this PR closes the genuine ones.

Triage of the 9 unaddressed comments

| PR | file:line (finding) | P | verdict |
|---|---|---|---|
| #806 | integration-final-review.yml:195 (filter fail-open) | P1 | FIXED: filter_matrix catches import + per-cell exceptions (fail-open) |
| #810 | build_integration_review_pack.py (deterministic_reject left True) | P1 | OPEN → fixed here |
| #810 | codex_review.py (endswith('/v1')) | P2 | STALE: heuristic gone (Moon Bridge) |
| #808 | codex_review.py:281 (CODEX_API_KEY in env) | P2 | cosmetic/intentional, kept (codex ignores it; Moon Bridge routes via config.toml) |
| #802 | codex_review.py:191 (greedy {.*}) | P2 | OPEN → fixed here |
| #802 | integration_matrix.py:1121 (CalledProcessError) | P2 | OPEN → fixed here |
| #802 | codex_review.py:163 ([:8000] cut) | P2 | cosmetic-accepted: fenced untrusted text the LLM reads as prose |
| #802 | integration-scope.yml:265 (empty matrix) | P1 | FIXED: has_cells green-no-op guard |
| #802 | integration-final-review.yml:430 (DeepSeek creds bleed into codex) | P1 | FIXED: _codex_env isolation |

Fixes in this PR

  1. P1: deterministic_reject left True after R-OUTCOME demotion. When an R-OUTCOME-only reject is demoted to healthy + quarantine, the grade still serialized deterministic_reject: True into agent_judge_summary.json. codex reads that file, and a "healthy slot with a deterministic reject" can spuriously push it to downgrade mergeable → not mergeable, directly undermining the gate calibration in #814. The flag is now cleared on demotion, with a regression test.
  2. P2: greedy JSON regex in the deepseek per-rollout finding parser, replaced with json.JSONDecoder().raw_decode of the first complete object. raw_decode tolerates trailing prose, whereas the greedy {.*} merged two objects into one invalid blob and lost the finding.
  3. P2: uncaught CalledProcessError from git diff (run with check=True), now re-raised as ScopeError so a failed/absent head_sha fails closed cleanly instead of producing an uncaught traceback.
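
To illustrate fix 2, here is a minimal sketch of the difference between the two parsing strategies. The helper names `first_json_object` and `greedy_json_object` and the sample reply are hypothetical and not taken from codex_review.py; only `json.JSONDecoder().raw_decode` and the greedy `{.*}` pattern come from the PR text.

```python
import json
import re


def first_json_object(raw: str):
    """Return the first complete JSON object in raw, or None (hypothetical helper)."""
    start = raw.find("{")
    if start == -1:
        return None
    try:
        # raw_decode parses exactly one JSON value beginning at index `start`
        # and returns (value, end_index); text after end_index is ignored.
        obj, _end = json.JSONDecoder().raw_decode(raw, start)
    except json.JSONDecodeError:
        return None
    return obj


def greedy_json_object(raw: str):
    """The old approach: greedy DOTALL match from the first '{' to the last '}'."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None


# Made-up model reply containing two objects and trailing prose.
reply = 'Finding: {"rule": "R-1", "ok": false}\nSecond: {"rule": "R-2"} trailing prose'

print(first_json_object(reply))   # {'rule': 'R-1', 'ok': False}
print(greedy_json_object(reply))  # None: the match spans both objects and is invalid JSON
```

This sketch returns None when the first `{` does not begin valid JSON; how the real parser treats that case is not stated in the PR.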

Validation: 112 tests pass (incl. the new regression); ruff clean.

Cross-checked the unaddressed greptile comments on the earlier integration PRs
(#802-#810); most were already fixed or stale. Three genuine ones remained:

- P1 (build_integration_review_pack): an R-OUTCOME-only deterministic reject is
  demoted to a healthy+quarantine slot, but grade["deterministic_reject"] was
  left True and serialized into agent_judge_summary.json. codex reads that and a
  "healthy slot with a deterministic reject" can spuriously push it to downgrade
  mergeable -> not mergeable. Clear the flag on demotion. + regression test.
- codex_review deepseek-finding parser: replace the greedy `{.*}` DOTALL regex
  (which merges two JSON objects into one invalid blob) with raw_decode of the
  first complete object, tolerant of trailing prose.
- integration_matrix: `git diff` runs with check=True but CalledProcessError
  bypassed the ScopeError fail-closed handler (uncaught traceback). Re-raise it
  as ScopeError so a failed/absent head_sha fails closed cleanly.
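
The fail-closed pattern in the last bullet can be sketched roughly as below. This is an illustration under assumptions: the `ScopeError` class here is a stand-in, `changed_files` and this `main` are hypothetical, and a small Python subprocess simulates a failing `git diff` so the example runs without a repository.

```python
import subprocess
import sys


class ScopeError(Exception):
    """Stand-in for the ScopeError named in the PR; the real class may differ."""


def changed_files(cmd: list[str]) -> list[str]:
    """Run a diff-style command with check=True and return its output lines."""
    try:
        result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as exc:
        # Convert the subprocess failure into the domain error the caller handles.
        raise ScopeError(
            f"command failed with exit code {exc.returncode}: {exc.stderr.strip()}"
        ) from exc
    return result.stdout.splitlines()


def main(cmd: list[str]) -> int:
    """Fail closed: any ScopeError becomes a clean non-zero exit, not a traceback."""
    try:
        files = changed_files(cmd)
    except ScopeError as exc:
        print(f"scope error: {exc}", file=sys.stderr)
        return 1
    print("\n".join(files))
    return 0


# Simulated commands (assumption: the real call is some form of `git diff`).
failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
passing = [sys.executable, "-c", "print('a.py')"]

print(main(passing))  # prints a.py, then 0
print(main(failing))  # reports the scope error on stderr, then prints 1
```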
@Yiminnn Yiminnn temporarily deployed to pypi-internal-preview June 20, 2026 19:32 — with GitHub Actions Inactive
greptile-apps (Bot, Contributor) commented Jun 20, 2026

Greptile Summary

This PR closes three genuine residual findings from earlier integration PRs: it clears deterministic_reject when an R-OUTCOME-only reject is demoted to healthy, replaces a greedy regex with json.JSONDecoder().raw_decode for per-rollout DeepSeek finding parsing, and wraps the bare subprocess.CalledProcessError from git diff inside a ScopeError for clean fail-closed handling.

  • build_integration_review_pack.py: slot.grade["deterministic_reject"] is now set to False immediately before the quarantine message is appended during R-OUTCOME demotion, eliminating the contradictory "healthy slot with a deterministic reject" state that could spuriously push the codex reviewer to downgrade a verdict. A regression test is included.
  • codex_review.py: raw_decode replaces the greedy {.*} regex; it stops at the first complete JSON object and tolerates trailing prose without merging sibling objects into an invalid blob.
  • integration_matrix.py: CalledProcessError from git diff --name-only is caught and re-raised as ScopeError, which the existing main() handler already catches cleanly.
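
The demotion path described in the first bullet might look roughly like the sketch below. The `Slot` dataclass, the `rejects` key, and the quarantine note text are assumptions made for illustration; only `slot.grade["deterministic_reject"]`, the healthy/unhealthy statuses, and the R-OUTCOME rule name come from the PR.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """Hypothetical stand-in for the slot object in build_integration_review_pack.py."""
    grade: dict
    status: str = "unknown"
    notes: list = field(default_factory=list)


def classify_one(slot: Slot) -> Slot:
    # Assumed shape: grade["rejects"] lists the rule ids that rejected the rollout.
    rejects = slot.grade.get("rejects", [])
    if not slot.grade.get("deterministic_reject"):
        slot.status = "healthy"
        return slot
    non_outcome = [rule for rule in rejects if rule != "R-OUTCOME"]
    if non_outcome:
        slot.status = "unhealthy"
        return slot
    # R-OUTCOME-only reject: demote to healthy + quarantine, and clear the flag
    # so the serialized grade no longer contradicts the healthy status.
    slot.status = "healthy"
    slot.grade["deterministic_reject"] = False
    slot.notes.append("R-OUTCOME quarantined")
    return slot


demoted = classify_one(Slot(grade={"deterministic_reject": True, "rejects": ["R-OUTCOME"]}))
print(demoted.status, demoted.grade["deterministic_reject"])  # healthy False
```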

Confidence Score: 5/5

All three changes are narrow, targeted fixes with no new surface area introduced; the deterministic_reject demotion fix is backed by a regression test.

Each change addresses a well-scoped bug: the deterministic_reject flag is cleared at exactly the right point in _classify_one, the raw_decode replacement handles every edge case the old regex missed, and the ScopeError wrapping is consistent with every other error site in integration_matrix.py. The regression test correctly exercises the demotion path by monkey-patching via the module reference. No pre-existing logic is altered beyond the minimal targeted lines.

No files require special attention — all four changed files are straightforward and self-contained.

Important Files Changed

| Filename | Overview |
|---|---|
| .github/scripts/build_integration_review_pack.py | Adds slot.grade["deterministic_reject"] = False before the quarantine append during R-OUTCOME-only demotion; fix is minimal and correctly placed. |
| .github/scripts/codex_review.py | Replaces greedy {.*} regex with raw.find("{") + json.JSONDecoder().raw_decode; correctly handles trailing prose and multiple sibling JSON objects. |
| .github/scripts/integration_matrix.py | Wraps subprocess.CalledProcessError from git diff in ScopeError; consistent with all other error-handling sites in this file and caught cleanly by main(). |
| tests/test_build_review_pack.py | Adds a focused regression test for the deterministic_reject demotion fix; monkey-patch of rubric_checks.grade_rollout is correct because build_integration_review_pack accesses it via the module reference. |
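
The remark about patching "via the module reference" can be illustrated with a small self-contained sketch. The module object built here is a stand-in for rubric_checks, and the return values are made up; the point is only that a patch on the module attribute is seen by callers that look the function up on the module at call time, and not by callers that bound the function object earlier.

```python
import types

# Hypothetical stand-in for the rubric_checks module.
rubric_checks = types.ModuleType("rubric_checks")
rubric_checks.grade_rollout = lambda rollout: {"deterministic_reject": False}


def grade_via_module(rollout):
    # Attribute lookup happens at call time, so a patched attribute is picked up.
    return rubric_checks.grade_rollout(rollout)


# Equivalent in effect to `from rubric_checks import grade_rollout`:
# the name is bound to the original function object once.
grade_rollout = rubric_checks.grade_rollout


def grade_via_bound_name(rollout):
    return grade_rollout(rollout)


def fake_grade(rollout):
    return {"deterministic_reject": True}


# Roughly what pytest's monkeypatch.setattr(rubric_checks, "grade_rollout", fake_grade)
# does, minus the automatic undo at test teardown.
rubric_checks.grade_rollout = fake_grade

print(grade_via_module("r1"))      # {'deterministic_reject': True}
print(grade_via_bound_name("r1"))  # {'deterministic_reject': False}
```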

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[_classify_one] --> B{deterministic_reject?}
    B -- No --> C[slot.status = healthy]
    B -- Yes --> D{non_outcome rejects?}
    D -- Yes --> E[slot.status = unhealthy]
    D -- No R-OUTCOME only --> F[slot.status = healthy]
    F --> G["slot.grade['deterministic_reject'] = False NEW"]
    G --> H[Append R-OUTCOME quarantine message]
    H --> I[Serialize to agent_judge_summary.json]
    I --> J[codex_review reads summary]
    J --> K{deterministic_reject?}
    K -- False --> L[No spurious downgrade]
    K -- True was bug --> M[Spurious not-mergeable downgrade]

Reviews (1): Last reviewed commit: "fix(integration): clear residual greptil..." | Re-trigger Greptile

@Yiminnn Yiminnn merged commit 7a0f1ba into main Jun 20, 2026
8 checks passed
@Yiminnn Yiminnn deleted the fix/integration-residual-greptile branch June 20, 2026 22:36