fix(SESF-44): tighten Tier-2 auth keyword + reject non-literal assignment values#42

Merged
lbruton merged 3 commits into main from sketch/SESF-44-tier2-assignment-precision
Jun 15, 2026

Conversation

@lbruton

@lbruton lbruton commented Jun 14, 2026

Summary

Precision tune of the Tier-2 contextual-assignment detector in secret_redaction.py. Eliminates the two false-positive classes that dominated the SESF-42 live baseline (~47% of 882 Tier-2 hits), with no loss of recall on genuine secrets.

  • D-1 — auth keyword: drop bare auth (it substring-matched the author* / authority / co-author git-metadata family) and add a left-anchored auth family (?<![A-Za-z])(?:oauth|authorization|auth[_-]?token). The lookbehind stops oauth from re-matching inside coauthor / coAuthoredBy; auth_token still matches via the existing bare token keyword.
  • D-2 — non-literal values: new pure _is_nonliteral_assignment(line, match) drops command-substitution $(…) and env-interpolation ${…} matches — including the case where the secret keyword is matched inside the braces (${FOO_PASSWORD:-default} → key FOO_PASSWORD, value -default), which a value-shape check alone misses.
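As a rough illustration of D-1, the sketch below compiles only the auth family quoted above and checks it against the key names listed in the test plan. The `re.IGNORECASE` flag and the helper name `matches_auth_family` are assumptions made for this example; the real `_SECRET_KEYWORD` contains other keywords (such as bare `token`) that are not reproduced here.

```python
import re

# Only the auth family from D-1. The IGNORECASE flag is an assumption;
# the actual _SECRET_KEYWORD pattern has more alternatives than this.
AUTH_FAMILY_RE = re.compile(
    r"(?<![A-Za-z])(?:oauth|authorization|auth[_-]?token)",
    re.IGNORECASE,
)


def matches_auth_family(key: str) -> bool:
    """Return True if the key contains a left-anchored auth-family keyword."""
    return AUTH_FAMILY_RE.search(key) is not None


# Git-metadata keys that the old bare `auth` keyword substring-matched.
AUTHOR_FAMILY = [
    "author", "authored", "authoredDate", "authorAssociation",
    "authority", "gradingAuthority", "coauthor", "coAuthoredBy",
]

# Keys that should keep matching after the change.
GENUINE_KEYS = [
    "authorization", "oauth", "oauth_token",
    "auth_token", "auth-token", "authToken",
]

for key in AUTHOR_FAMILY + GENUINE_KEYS:
    print(f"{key}: {matches_auth_family(key)}")
```

The lookbehind matters for `coauthor` and `coAuthoredBy`: `oauth` occurs inside both, but it is preceded by the letter `c`, so the match is rejected.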

Both fixes sit upstream of the shared _aggregate_maskable_candidates, so they benefit both redact() (SESF-41 ingestion guard) and scan_spans() (SESF-42 sanitizer) with zero call-site changes.
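The D-2 filter can be sketched along the same lines. Everything below is a simplified stand-in, not the code in secret_redaction.py: `ASSIGNMENT_RE`, its keyword list, and the helper `literal_hits` are hypothetical, and `is_nonliteral_assignment` only mirrors the behaviour described above (drop values starting with `$(`, and drop matches that overlap a `${…}` span).

```python
import re

# Hypothetical, simplified assignment pattern: key containing a secret
# keyword, a ':' or '=' separator, then a value that excludes braces.
ASSIGNMENT_RE = re.compile(
    r"(?P<key>[A-Za-z_]*(?:password|api_key|token)[A-Za-z_]*)"
    r"\s*[:=]\s*"
    r"(?P<value>[^\s{}\"']+)",
    re.IGNORECASE,
)

# Spans of ${...} env interpolation on a line.
INTERPOLATION_SPAN_RE = re.compile(r"\$\{[^}]*\}")


def is_nonliteral_assignment(line: str, match: re.Match) -> bool:
    """Return True when the matched value is not a literal secret."""
    # Command substitution: KEY=$(some command)
    if match.group("value").startswith("$("):
        return True
    # Env interpolation: the match overlaps a ${...} span, which also
    # covers the case where the keyword itself sits inside the braces.
    for span in INTERPOLATION_SPAN_RE.finditer(line):
        if match.start() < span.end() and span.start() < match.end():
            return True
    return False


def literal_hits(line: str) -> list[tuple[str, str]]:
    """Return (key, value) pairs that survive the non-literal filter."""
    return [
        (m.group("key"), m.group("value"))
        for m in ASSIGNMENT_RE.finditer(line)
        if not is_nonliteral_assignment(line, m)
    ]


print(literal_hits("FOO_API_KEY=abc123def456"))
print(literal_hits("TOKEN=$(security find-generic-password -w)"))
print(literal_hits("URL=${FOO_PASSWORD:-default}"))
```

In the last line the pattern matches `FOO_PASSWORD:-default` with `-default` as the value, which is why a check on the value's shape alone would not catch it; the span-overlap test does.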

Test plan

  • New RED→GREEN tests in tests/test_secret_redaction.py, each asserted through both redact() and scan_spans():
    • AC-1: author/authored/authoredDate/authorAssociation/authority/gradingAuthority/coauthor/coAuthoredBy → not flagged
    • AC-3/AC-4: $(security …) and ${FOO_PASSWORD:-default} → not flagged
    • AC-2/AC-5: authorization/bare oauth/oauth_token/auth_token/auth-token/authToken + literal FOO_API_KEY → still flagged
  • Full suite: 378 passed (360 baseline + 18 new), ruff clean, Codacy 0 introduced findings (Trivy secret-scan clean on the synthetic test fixtures).

Links

  • Issue: SESF-44 (Plane)
  • Sketch: DocVault/Projects/SessionFlow/sketches/SESF-44-tier2-assignment-precision/ (requirements → discovery → approach → tasks, all Codex-reviewed + reconciled)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Improved Tier-2 assignment secret detection to reduce false positives for authorization-related keys, command-substitution values, and ${...} interpolation patterns (including default/error forms).
    • Continued to flag genuine secret-bearing assignment values and properly mask literal secret content.
  • Tests

    • Added SESF-44 coverage validating suppression of Tier-2 hits in both enforcement redaction and reporting scan modes.
    • Added checks for literal-value detection and consistency with non-secret placeholder assignments.

lbruton added 2 commits June 14, 2026 18:19
Cover the author*/authority/co-author FP family, $(...) command-sub and
${...} env-interpolation FP values, and genuine-key true positives
(incl. bare oauth) through both redact() and scan_spans(). 10 RED, 8 green.
D-1: drop bare `auth` (substring-matched author*/authority/co-author git
metadata); add left-anchored `(?<![A-Za-z])(?:oauth|authorization|auth[_-]?token)`
so `oauth` no longer matches inside `coauthor`/`coAuthoredBy`.

D-2: new pure `_is_nonliteral_assignment(line, match)` drops command-sub
`$(...)` and env-interpolation `${...}` matches (incl. the keyword-inside-braces
case where the captured value is the default, not `${...}`).

Both fixes sit upstream of the shared aggregation, so redact() (SESF-41 guard)
and scan_spans() (SESF-42 sanitizer) both benefit. 378 passed, ruff clean.
@coderabbitai

coderabbitai Bot commented Jun 14, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f58e485e-1762-4984-bab3-7e52a308c1e6

📥 Commits

Reviewing files that changed from the base of the PR and between 84d9f53 and 079d327.

📒 Files selected for processing (2)
  • secret_redaction.py
  • tests/test_secret_redaction.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/test_secret_redaction.py
  • secret_redaction.py

Walkthrough

Tightens Tier-2 ASSIGNMENT detection in secret_redaction.py per SESF-44: the _SECRET_KEYWORD regex loses bare auth and gains a negative lookbehind; a new _INTERPOLATION_SPAN_RE and _is_nonliteral_assignment helper together filter out command-substitution and interpolation values in _collect_candidates. Tests cover all suppression and positive-control cases.

Changes

SESF-44 Tier-2 Assignment Precision

| Layer / File(s) | Summary |
| --- | --- |
| Keyword regex and interpolation span detection (secret_redaction.py) | _SECRET_KEYWORD drops bare auth and adds a negative lookbehind to prevent oauth/authorization substring hits; _INTERPOLATION_SPAN_RE is added to detect ${…} spans needed by the downstream filter. |
| _is_nonliteral_assignment helper and _collect_candidates guard (secret_redaction.py) | New _is_nonliteral_assignment returns True when a matched value begins with $( or sits inside a ${…} span; _collect_candidates skips emitting an ASSIGNMENT candidate when it returns True. |
| SESF-44 tests: helpers, suppression, and positive controls (tests/test_secret_redaction.py) | Adds _has_tier2_hit/_has_assignment_span helpers; parametrized suppression tests for author/authority/co-author keys, $(...) values, and ${VAR:-default} interpolations; positive controls for genuine secret keys with literal values; and a none-value placeholder consistency test. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • lbruton/SessionFlow#37: Directly modifies the same Tier-2 contextual assignment detection logic in secret_redaction.py and its corresponding test file, making it a direct predecessor to this change.

Poem

A bare auth once slipped through the net,
Now a lookbehind keeps it in check.
$(command) and ${VAR}? Not a threat —
The literal alone earns the red.
SESF-44, no false alarms yet. 🔐

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and specifically describes the two main changes: tightening the auth keyword regex and rejecting non-literal assignment values, both core to fixing the SESF-44 false positives. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@codacy-production

codacy-production Bot commented Jun 14, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues


🟢 Metrics 18 complexity · 0 duplication

Metric Results
Complexity 18 (≤ 500 complexity)
Duplication 0 (≤ 5 duplication)

@codacy-production codacy-production Bot left a comment

Pull Request Overview

The pull request effectively refines the Tier-2 secret detector to reduce false positives by tightening the 'auth' keyword regex and introducing logic to ignore non-literal assignments (command substitutions and environment interpolations). Codacy analysis indicates the changes are up to standards.

However, a logic issue in secret_redaction.py was identified. An 'if' statement intended to catch non-literal values is unreachable because the regex used to capture the assignment value explicitly excludes curly braces. This should be corrected to ensure command substitutions are correctly ignored as per the acceptance criteria.

Test suggestions

  • Author family keys (author, authoredDate, coauthor, etc.) with literal values are not flagged.
  • Genuine keys (oauth, authorization, auth_token, authToken) with literal values are still flagged.
  • Command substitution values like $(security ...) are ignored.
  • Environment variable interpolation like ${FOO_PASSWORD:-default} is ignored even when the key matches.
  • Simple environment variables like ${VALUE} are ignored.

Comment thread secret_redaction.py Outdated

@codacy-production codacy-production Bot left a comment

Pull Request Overview

This PR aims to refine secret detection by narrowing the 'auth' keyword scope and ignoring non-literal assignments. However, two critical issues prevent approval: the production code changes (secret_redaction.py) are missing from the provided diff, and the logic to exclude the 'author' family currently risks a regression by dropping the bare 'auth' keyword entirely.

While the Codacy analysis is positive, the test suite requires expansion. Specifically, the implementation should be verified against more shell-style variations (quotes, backticks) and ensure parity between JSON and shell-assignment formats. The regex implementation should also be checked for case-sensitivity once the source code is provided.

About this PR

  • The actual implementation changes in secret_redaction.py (including the regex updates and the _is_nonliteral_assignment function) are missing from the provided diff. Only the test file is visible. Please ensure all modified files are properly staged and pushed so the logic can be verified.

Test suggestions

  • Verify 'author' family keys with literal values do not trigger Tier-2 hits.
  • Verify command substitution values (e.g., $(security ...)) are not flagged.
  • Verify environment variable interpolation values (e.g., ${VAR:-default}) are not flagged.
  • Verify genuine keys like 'oauth' and 'authorization' with literal values are still flagged.
  • Verify that placeholder values (e.g., 'none') for genuine keys remain unflagged.
  • Verify quoted command substitutions (e.g., KEY="$(command)") are not flagged.
  • Verify shell backticks (e.g., KEY=`command`) are not flagged.
  • Verify additional interpolation operators (e.g., ${VAR:?error}) are not flagged.
Prompt proposal for missing tests
Consider implementing these tests if applicable:
1. Verify quoted command substitutions (e.g., KEY="$(command)") are not flagged.
2. Verify shell backticks (e.g., KEY=`command`) are not flagged.
3. Verify additional interpolation operators (e.g., ${VAR:?error}) are not flagged.
Low confidence findings
  • Once the production code is available, please verify that the regex (?<![A-Za-z])(?:oauth|authorization|auth[_-]?token) handles case-insensitivity correctly (e.g., for 'authToken') to ensure it meets the requirements for diverse coding styles.
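The case-sensitivity question can be checked in isolation. The sketch below compiles the quoted pattern twice, with and without `re.IGNORECASE`, and compares the results; which flags the production `_SECRET_KEYWORD` actually uses is not shown in this thread, so both variants here are assumptions and `flags_key` is a hypothetical helper.

```python
import re

PATTERN = r"(?<![A-Za-z])(?:oauth|authorization|auth[_-]?token)"

# Two hypothetical compilations of the same pattern.
case_sensitive = re.compile(PATTERN)
case_insensitive = re.compile(PATTERN, re.IGNORECASE)


def flags_key(regex: re.Pattern, key: str) -> bool:
    """Return True if the regex finds an auth-family keyword in the key."""
    return regex.search(key) is not None


for key in ["authToken", "AUTHORIZATION", "auth_token", "coAuthoredBy"]:
    print(key, flags_key(case_sensitive, key), flags_key(case_insensitive, key))
```

Without the flag, camelCase and upper-case spellings such as `authToken` and `AUTHORIZATION` are missed by this family, while `coAuthoredBy` is rejected either way because of the lookbehind.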

Comment thread tests/test_secret_redaction.py
Comment thread tests/test_secret_redaction.py

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@secret_redaction.py`:
- Around line 103-112: The _SECRET_KEYWORD regex pattern definition uses a
Bandit suppression comment (# nosec B105) on line 103, but Ruff is still
flagging this code with the S105 error on the subsequent lines containing the
keyword patterns. Replace the Bandit-specific # nosec B105 comment with Ruff's
compatible suppression format (# noqa: S105) so that the security linting check
is properly suppressed for this intentional detector regex.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bcf6144a-0cc0-457a-8fd6-db77bf4aa7ae

📥 Commits

Reviewing files that changed from the base of the PR and between c642208 and 84d9f53.

📒 Files selected for processing (2)
  • secret_redaction.py
  • tests/test_secret_redaction.py

Comment thread secret_redaction.py
- Remove unreachable `value.startswith("${")` branch in _is_nonliteral_assignment
  (the value capture class excludes `{`, so ${ can never start a captured value;
  interpolation is handled by the span-overlap check). [codacy 3410428483]
- Add `# noqa: S105` on the keyword string so CodeRabbit's S-enabled ruff passes
  (project ruff is D1-only; this is keyword NAMES, not a credential). [coderabbit 3410431432]
- Tests: add shell-form parity to the AC-1 author-family cases, a quoted
  command-substitution case, and a ${VAR:?error} interpolation case. [codacy 3410428669]

378 passed, ruff clean (incl. S105).
@lbruton lbruton merged commit a7d5a37 into main Jun 15, 2026
5 checks passed
@lbruton lbruton deleted the sketch/SESF-44-tier2-assignment-precision branch June 15, 2026 00:11