Skip to content

feat(SESF-42): retroactive secret sanitizer (CLI + MCP)#40

Merged
lbruton merged 7 commits into
mainfrom
feat/SESF-42-retroactive-secret-sanitizer
Jun 13, 2026
Merged

feat(SESF-42): retroactive secret sanitizer (CLI + MCP)#40
lbruton merged 7 commits into
mainfrom
feat/SESF-42-retroactive-secret-sanitizer

Conversation

@lbruton

@lbruton lbruton commented Jun 13, 2026

Copy link
Copy Markdown
Owner

Summary

Adds an operator-driven sanitizer that removes secrets already indexed in the Milvus document field, the FTS5 content column, and the embedding vector — the retroactive half of the SESF-35 research (SESF-41 shipped the ingestion guard). Exposed via a cleanup.py sanitize CLI subcommand and a dedicated sanitize_index MCP tool.

  • Dry-run is the default — reports per-rule counts + affected turns and writes a value-free 0600 audit JSONL (~/.sessionflow/audit/); zero writes to any store.
  • --apply --yes redacts in place and re-embeds the redacted text (throttled through the embedding budget, 200ms floor; checkpointed/resumable). --apply --yes --drop deletes affected turns instead.
  • Both surfaces refuse to apply without explicit confirmation and never emit a secret value.

New primitives

  • secret_redaction.scan_spans — per-occurrence, value-free audit spans (reuses the SESF-41 detector); snippets mask every detected span in-window.
  • rag_engine.upsert_document (Milvus upsert-by-PK + FTS metadata-preserving rewrite) and delete_by_doc_id (DeleteResult surfaces FTS outcome). FTS-rewrite/-delete failure keeps a turn retryable, never silently "done".

Linked issue

  • Plane SESF-42 — Retroactive secret sanitizer for already-indexed data (CLI + MCP)

Tests

  • pytest full suite: 349 passed / 0 failed (300 baseline + 49 new). ruff clean.
  • TDD: 26 failing tests written before implementation; +5 regression tests from CodeRabbit review.
  • Codacy local SAST: PASS (0 introduced Critical/High).
  • CodeRabbit: 3 correctness findings (empty-vector-on-resume, FTS metadata loss, drop done-marking) — all fixed + regression-tested.

Live exposure baseline (dry-run, no writes)

Ran against the live Standalone index: 1296 affected turns (HIGH_ENTROPY 6136, ASSIGNMENT 882; codex 5628 / claude_code_cli 980 / antigravity_desktop 203 / opencode 118 / antigravity_cli 89). Audit verified value-free on real data (7018 records, 0 raw-value fields, 0 leaking snippets, 0600).

Out of scope

  • Destructive --apply on the live index (a separate, deliberate operator action).
  • Upstream source-transcript rewriting (SessionFlow sanitizes only what it owns).

Version

No version bump — SessionFlow has no version-lock system.

Redaction is irreversible and is not a substitute for rotation — rotate any key that was ever indexed.

Summary by CodeRabbit

  • New Features
    • Added a retroactive secret sanitization workflow to find previously stored secrets and either redact them or remove affected turns.
    • Dry-run is the default, with explicit confirmation required for any destructive action.
    • Added scope filters to target specific projects, sessions, providers, or date ranges.
    • Sanitization reports now show counts, status, and audit details without exposing secret values.

lbruton added 3 commits June 13, 2026 17:20
Add an operator-driven sanitizer that removes secrets already indexed in
Milvus (document), FTS5 (content), and the embedding vector.

- secret_redaction.scan_spans: per-occurrence, value-free audit spans
  (rule/tier/offset/length/masked_snippet); masked_snippet masks every
  detected span in its window (no raw value, any tier).
- rag_engine.upsert_document / delete_by_doc_id: Milvus upsert-by-PK +
  FTS delete-then-insert (redact) / dual delete (drop); FTS failure
  surfaced distinctly so a row stays retryable.
- sanitize.py: scan (dry-run) / apply (redact|drop) orchestrator with
  worklist+done checkpoint, throttled re-embed (200ms floor), and 0600
  value-free audit JSONL.
- cleanup.py sanitize subcommand (refuse without --yes) + tools.py
  sanitize_index MCP tool (refuse apply without confirm).
- Tests: 44 new (scan_spans, orchestrator, primitives, CLI, MCP).

Live dry-run baseline: 1296 affected turns (HIGH_ENTROPY 6136,
ASSIGNMENT 882); audit verified value-free on real data.

Refs SESF-42
- upsert_document: new_vector now Optional; None preserves the stored
  vector (resume/FTS-only converge no longer corrupts the 768-dim row).
- upsert_document: FTS rewrite rebuilds the full metadata record from the
  fetched row (was dropping ~13 columns → degraded filtered/BM25 search).
- delete_by_doc_id now returns DeleteResult(deleted, fts_ok); drop path
  marks a turn done only when both Milvus and FTS deletes succeed,
  mirroring the redact FTS-incomplete contract.
- _build_worklist caches scanned rows so the resume/no-spans path is
  reachable.
- tools.py refusal wording reflects drop; sanitize.apply pause_event typed.
- +5 tests for the previously-mocked vector/metadata/drop-FTS paths.

Refs SESF-42
Copilot AI review requested due to automatic review settings June 13, 2026 22:39
@coderabbitai

coderabbitai Bot commented Jun 13, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Warning

Review limit reached

@lbruton, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 28 minutes and 39 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 44f2194a-338b-4380-b848-4a43fcb44f02

📥 Commits

Reviewing files that changed from the base of the PR and between ddd56ea and e0fe9b1.

📒 Files selected for processing (6)
  • rag_engine.py
  • sanitize.py
  • secret_redaction.py
  • tests/test_sanitize.py
  • tests/test_secret_redaction.py
  • tools.py

Walkthrough

Implements SESF-42: a retroactive secret sanitizer that detects previously-indexed secrets in Milvus and FTS5. Adds secret_redaction.scan_spans(), dual-write rag_engine.upsert_document/delete_by_doc_id primitives, a sanitize.py orchestrator with dry-run/apply/drop modes, a cleanup.py sanitize CLI subcommand, and a sanitize_index MCP tool, each with comprehensive tests and docs.

Changes

SESF-42 Retroactive Secret Sanitizer

Layer / File(s) Summary
Span detection primitive
secret_redaction.py, tests/test_secret_redaction.py
Adds Span NamedTuple and public scan_spans(text, *, mode, allowlist) with internal _maskable_values() and _collect_spans(). Returns per-occurrence spans with value-free masked_snippet; identical span list across "report" and "enforce" modes. Tests assert per-occurrence semantics, offset accuracy, tier assignment, snippet value-freedom (including neighbor-overlap and keyword-outside-window edge cases), determinism, and redact() contract stability.
Dual-write RAG engine primitives
rag_engine.py, tests/test_sanitize.py (primitive harness)
Adds UpsertResult/DeleteResult NamedTuples and upsert_document()/delete_by_doc_id(). Each function writes Milvus then FTS5 and surfaces milvus_ok/fts_ok independently; new_vector=None preserves the existing stored vector. Primitive contract tests verify Milvus upsert + FTS delete-then-insert ordering, metadata copying, distinct failure reporting, and dual-delete behavior.
sanitize.py orchestrator
sanitize.py, tests/test_sanitize.py (orchestration tests)
New module with Scope (filter builder), SanitizeReport, keyset-batched scan(), and apply() with per-turn checkpointing, FTS-failure incomplete tracking, pause-event support, throttled re-embedding, and value-free JSONL audit trail under ~/.sessionflow/audit/ (0600). _apply_redact() re-embeds redacted text; _apply_drop() deletes instead. Tests cover scope filtering, dry-run no-writes contract, redact/drop apply, confirmation gate, FTS-failure incompletion, resume/skip semantics, budget throttling, and no-leak invariants.
CLI subcommand
cleanup.py, tests/test_cleanup_sanitize.py
Adds cmd_sanitize() and _print_sanitize_report() plus parser wiring for sanitize (mutually exclusive --dry-run/--apply, --drop, --yes, and scoping flags). --apply requires --yes; --drop requires --apply. Tests cover parser registration, default dry-run posture, confirmation gate refusal, apply dispatch, scope flag mapping, and CLI no-leak invariant.
MCP tool adapter
tools.py, tests/test_tools_sanitize.py
Adds format_sanitize_report() markdown formatter, sanitize_index tool registration (schema: apply, drop, confirm, scoping fields — all optional, none required), and call_tool dispatch with scan/refuse/apply routing. Tests assert schema shape, default scan path, confirm-gate refusal, apply-with-confirm flag threading, drop propagation, and no-leak invariant across all response paths.
Feature docs
README.md, CLAUDE.md, CHANGELOG.md
Adds README section on cleanup.py sanitize usage (dry-run, --apply/--drop/--yes, scoping, audit trail format, rotation warning). CLAUDE.md adds SESF-42 operational notes on FTS retry semantics, irreversibility, and checkpoint locations. CHANGELOG records new primitives and behaviors.

Sequence Diagram(s)

sequenceDiagram
  participant Operator
  participant cleanup.py / sanitize_index
  participant sanitize.py
  participant secret_redaction
  participant rag_engine
  participant Milvus
  participant FTS5

  rect rgba(100, 100, 200, 0.5)
    note over Operator,FTS5: Dry-run (scan)
    Operator->>cleanup.py / sanitize_index: sanitize (no --apply / apply=false)
    cleanup.py / sanitize_index->>sanitize.py: scan(scope)
    sanitize.py->>Milvus: keyset-batch query in-scope rows
    Milvus-->>sanitize.py: rows
    sanitize.py->>secret_redaction: scan_spans(document, mode="report")
    secret_redaction-->>sanitize.py: (text, spans)
    sanitize.py-->>cleanup.py / sanitize_index: SanitizeReport(mode="dry-run", audit_path)
    cleanup.py / sanitize_index-->>Operator: counts + audit path (no secret values)
  end

  rect rgba(100, 200, 100, 0.5)
    note over Operator,FTS5: Apply (redact or drop)
    Operator->>cleanup.py / sanitize_index: sanitize --apply --yes / apply=true, confirm=true
    cleanup.py / sanitize_index->>sanitize.py: apply(scope, drop, confirmed=True)
    loop per doc_id in worklist
      alt drop=False
        sanitize.py->>secret_redaction: scan_spans(document, mode="enforce")
        secret_redaction-->>sanitize.py: (redacted_text, spans)
        sanitize.py->>rag_engine: upsert_document(doc_id, redacted_text, new_vector)
        rag_engine->>Milvus: upsert row
        rag_engine->>FTS5: delete then insert
        rag_engine-->>sanitize.py: UpsertResult(milvus_ok, fts_ok)
      else drop=True
        sanitize.py->>rag_engine: delete_by_doc_id(doc_id)
        rag_engine->>Milvus: delete row
        rag_engine->>FTS5: delete row
        rag_engine-->>sanitize.py: DeleteResult(deleted, fts_ok)
      end
      sanitize.py->>sanitize.py: checkpoint after each turn
    end
    sanitize.py-->>cleanup.py / sanitize_index: SanitizeReport(rotate_warning=True)
    cleanup.py / sanitize_index-->>Operator: report (no secret values)
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • lbruton/SessionFlow#37: Introduced the tiered secret redaction engine (secret_redaction.py, redact(), Hit) that this PR directly extends with the new Span type and scan_spans() per-occurrence audit API.

Poem

Old secrets crept inside the index, hiding in the dark,
Now scan_spans hunts them one by one and leaves a value-free mark.
With --apply --yes the redact path rewrites and re-embeds,
Or --drop deletes the row entire — no trace of what it said.
Checkpointed, throttled, audit-logged, with FTS errors tracked,
The secrets gone, the index clean — rotation still exact. 🔑

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 63.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: adding a retroactive secret sanitizer exposed through CLI and MCP.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/SESF-42-retroactive-secret-sanitizer

Comment @coderabbitai help to get the list of available commands and usage tips.

@codacy-production

codacy-production Bot commented Jun 13, 2026

Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 315 complexity · 4 duplication

Metric Results
Complexity 315 (≤ 500 complexity)
Duplication 4 (≤ 5 duplication)

View in Codacy

AI Reviewer: first review requested successfully. AI can make mistakes. Always validate suggestions.

Run reviewer

TIP This summary will be updated as you push new changes.

@codacy-production codacy-production Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

The PR is currently not up to standards due to critical scalability risks and a failure to correctly implement hardware-constrained throttling. While the core functionality for retroactive sanitization is present, the implementation contains a high-severity risk of Out-Of-Memory (OOM) errors when processing large indices because full document payloads are cached in memory.

Additionally, the re-embedding throttle contains a logic flaw that allows it to bypass safety waits, violating the requirement for hardware-aligned budget management. Finally, there is significant logic duplication between real-time and retroactive scanning paths, which must be refactored to ensure consistent secret prioritization and maintainability.

About this PR

  • The orchestration logic for scanning rows is not scalable. Caching full document payloads for every turn in a repository will lead to OOM crashes during large-scale index cleaning. The system should be refactored to process payloads just-in-time or only cache lightweight metadata.
  • There is a systemic duplication of secret candidate aggregation and filtering logic between the live redaction module and this new retroactive scanner. This creates a high risk of 'behavioral drift' where real-time and retroactive scanning apply different rules to the same content.

Test suggestions

  • Verify that a dry-run identifies secrets and generates an audit log without modifying Milvus or FTS.
  • Verify that apply mode refuses to execute and returns an error/status when confirmation is missing.
  • Verify that the audit log and CLI/MCP output contain only rule names, counts, and masked snippets, never raw secret values.
  • Verify that a resumed run skips doc_ids already marked 'done' in the checkpoint.
  • Verify that re-embedding calls the embedding budget's before_batch/after_batch for throttling.
  • Verify that FTS failures during a redact or drop operation leave the turn as retryable (not marked done).
  • Verify that upsert_document preserves all Milvus metadata fields while updating the document and vector.

TIP Improve review quality by adding custom instructions
TIP How was this review? Give us feedback

Comment thread sanitize.py Outdated
Comment thread secret_redaction.py Outdated
Comment thread sanitize.py Outdated
Comment thread sanitize.py

@codacy-production codacy-production Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

The pull request is currently not up to standards due to high-risk performance issues and security vulnerabilities. Specifically, the _build_worklist function in sanitize.py poses a significant risk of OutOfMemoryError when operating on large Milvus indices because it attempts to cache full document content in memory.

From a security and integrity perspective, the audit logging mechanism contains a race condition in permission handling and incorrectly truncates existing logs when a sanitization run is resumed. These issues must be addressed to ensure a secure and reliable audit trail. While the implementation successfully meets most acceptance criteria regarding dry-run defaults and confirmation prompts, the code duplication identified in redaction logic should also be resolved to prevent policy drift.

Test suggestions

  • CLI 'sanitize' defaults to dry-run and reports detections without writing
  • CLI 'sanitize --apply' refuses to proceed without the '--yes' confirmation flag
  • MCP 'sanitize_index' tool refuses to apply without 'confirm=true'
  • Redaction mode re-embeds the masked text using the throttled embedding budget
  • Drop mode deletes rows from both Milvus and FTS stores
  • FTS rewrite failure results in 'incomplete' status and leaves the turn in the worklist
  • Audit logs use 0600 permissions and do not leak raw secret values in snippets
  • FTS metadata is preserved during an in-place update
  • Sanitizer can resume an interrupted apply run from a durable checkpoint
  • Snippet generation for forced entropy tokens is value-free even if the keyword is outside the window

TIP Improve review quality by adding custom instructions
TIP How was this review? Give us feedback

Comment thread sanitize.py
Comment thread secret_redaction.py Outdated
Comment thread sanitize.py Outdated
Comment thread sanitize.py Outdated
Comment thread tests/test_sanitize.py Outdated

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds SESF-42 “retroactive secret sanitizer” capability to remove secrets that were already indexed (Milvus document, FTS5 content, and derived embeddings), exposed via both the cleanup.py sanitize CLI and a new sanitize_index MCP tool. This extends the existing SESF-41 ingestion-time guard by enabling operator-driven cleanup of historical data while preserving the “no secret value emitted” invariant via span-aware, value-free auditing.

Changes:

  • Introduces secret_redaction.scan_spans and span/snippet data model to support per-occurrence, value-free auditing.
  • Adds sanitizer orchestrator (sanitize.py) plus CLI/MCP adapters (confirmation-gated apply; dry-run default) and extensive tests.
  • Adds rag_engine.upsert_document / delete_by_doc_id primitives to rewrite/delete across Milvus + FTS with distinct FTS success reporting.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tools.py Registers sanitize_index MCP tool and adds markdown formatting for sanitize reports.
sanitize.py Implements scan/apply orchestration, audit JSONL writing, checkpointing, and embedding throttling.
secret_redaction.py Adds Span + scan_spans() for per-occurrence, value-free audit spans/snippets.
rag_engine.py Adds upsert_document / delete_by_doc_id primitives and result types for dual-store operations.
cleanup.py Adds cleanup.py sanitize subcommand and value-free report printing with confirmation gate.
README.md Documents retroactive sanitizer usage, flags, and audit/no-leak guarantees.
CHANGELOG.md Notes the new SESF-42 sanitizer feature set and operator warnings.
tests/test_tools_sanitize.py Tests MCP adapter wiring, confirmation gating, and no-leak output contract.
tests/test_cleanup_sanitize.py Tests CLI wiring, refusal-before-reads/writes semantics, and no-leak output.
tests/test_secret_redaction.py Adds regression coverage for span/snippet masking invariants in scan_spans.
tests/test_sanitize.py Adds orchestrator + rag_engine primitive tests for scan/apply/drop/resume/budget behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread sanitize.py Outdated
Comment thread rag_engine.py
Comment thread rag_engine.py
Comment thread tests/test_sanitize.py
@lbruton lbruton marked this pull request as ready for review June 13, 2026 23:01
Copilot AI review requested due to automatic review settings June 13, 2026 23:01
@lbruton lbruton removed the request for review from Copilot June 13, 2026 23:05

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (3)
sanitize.py (3)

243-252: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Audit file truncates existing logs on resume and has a TOCTOU race on permissions.

Opening with "w" mode destroys any prior audit entries for this run_id (e.g. on resume). The chmod after open leaves a window where the file exists with default umask permissions. Use atomic open with mode:

-        self._handle = open(self.path, "w", encoding="utf-8")
-        try:
-            os.chmod(self.path, _FILE_MODE)
-        except OSError:
-            pass
+        fd = os.open(self.path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, _FILE_MODE)
+        self._handle = os.fdopen(fd, "w", encoding="utf-8")

For resume scenarios, consider append mode (O_APPEND) instead of truncate, or use a fresh run_id.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sanitize.py` around lines 243 - 252, Update the audit file initialization in
sanitize.py’s __init__ so it does not truncate existing run logs on resume;
switch the file open behavior away from "w" to an atomic create/open strategy
with the intended permissions, and use append semantics if the same run_id must
preserve prior entries. Keep the permission-setting logic tied to the open path
to avoid the post-open TOCTOU window, and adjust the _audit_path/run_id handling
only if needed to support a fresh run_id for truncate-on-new-run behavior.

307-318: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

O(N) membership check on list creates O(N²) scan complexity.

doc_id not in worklist is O(N) for lists. With large indices this becomes a bottleneck. Use a set for tracking seen doc_ids:

     counts: dict = {}
-    worklist: list[str] = []
+    worklist: list[str] = []
+    worklist_set: set[str] = set()
     audit = _AuditWriter(run_id)
     try:
         for row in _iter_rows(scope, db_path):
             ...
             doc_id = row.get("doc_id")
-            if doc_id not in worklist:
+            if doc_id not in worklist_set:
+                worklist_set.add(doc_id)
                 worklist.append(doc_id)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sanitize.py` around lines 307 - 318, The worklist variable is initialized as
a list but used with the membership check operator (not in), which performs an
O(N) search. This creates O(N²) overall complexity when looping through rows.
Change the worklist initialization from list[str] = [] to set[str] = set(), and
replace the worklist.append(doc_id) call with worklist.add(doc_id). This will
reduce the membership check to O(1) and overall scan complexity to O(N).

368-375: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Throttle loop exits without enforcing the 200ms cooldown floor when budget denies but provides no delay.

When allowed=False and retry_after_seconds is 0/None, the loop breaks immediately without any wait. Per CLAUDE.md and retrieved learnings, the embedding backfill must never reduce cooldown below 100ms due to MLX Metal SIGSEGV risk.

             delay = getattr(decision, "retry_after_seconds", 0.0) or 0.0
-            if delay <= 0:
-                break
+            if delay <= 0:
+                delay = 0.2  # enforce 200ms floor per CLAUDE.md
             time.sleep(delay)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sanitize.py` around lines 368 - 375, The throttle loop in sanitize.py should
always enforce a minimum cooldown instead of breaking immediately when
budget.before_batch() returns allowed=False with no retry delay. Update the
backfill wait logic in the while True block so a denied decision with
retry_after_seconds missing or zero still sleeps for the required floor (at
least 200ms, and never below the documented 100ms minimum), then retries rather
than exiting the loop.

Sources: Coding guidelines, Learnings

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@rag_engine.py`:
- Around line 2041-2046: Update rag_engine.upsert_document so the FTS rewrite
uses the same UTF-8 truncated text stored in row["document"] rather than
new_document; keep the Milvus upsert and metadata handling unchanged, but pass
the truncated document value into the FTS insert path at the anchor site and the
sibling site (rag_engine.py 2041-2046 and 2061-2063) to preserve dual-write
alignment.
- Around line 2057-2085: The FTS rewrite logic in rag_engine.py only closes the
ephemeral SQLite connection on the success path; update both rewrite blocks
around _fts.connection(), _fts.delete(), and _fts.insert() so
_fts.close_ephemeral(conn) runs in a finally block (or equivalent) whenever conn
is created, including when delete/insert raises. Keep fts_ok handling and
logging intact, but ensure the connection is always released on failure paths.

In `@tests/test_sanitize.py`:
- Around line 306-317: In the test function
test_apply_without_confirmation_makes_no_calls, add an assertion that verifies
no reads were performed on the stubbed engine when confirmed=False is passed to
sanitize.apply(). Currently the test only asserts that writes (upserts, deletes,
embed_inputs) are blocked, but does not verify that the code skips reading from
the index altogether. Add an assertion checking the appropriate field in the cap
object that tracks read/search operations to ensure that reads are also blocked
when confirmation is not provided, consistent with the security requirement that
destructive sanitize operations must refuse unless explicitly confirmed.

---

Duplicate comments:
In `@sanitize.py`:
- Around line 243-252: Update the audit file initialization in sanitize.py’s
__init__ so it does not truncate existing run logs on resume; switch the file
open behavior away from "w" to an atomic create/open strategy with the intended
permissions, and use append semantics if the same run_id must preserve prior
entries. Keep the permission-setting logic tied to the open path to avoid the
post-open TOCTOU window, and adjust the _audit_path/run_id handling only if
needed to support a fresh run_id for truncate-on-new-run behavior.
- Around line 307-318: The worklist variable is initialized as a list but used
with the membership check operator (not in), which performs an O(N) search. This
creates O(N²) overall complexity when looping through rows. Change the worklist
initialization from list[str] = [] to set[str] = set(), and replace the
worklist.append(doc_id) call with worklist.add(doc_id). This will reduce the
membership check to O(1) and overall scan complexity to O(N).
- Around line 368-375: The throttle loop in sanitize.py should always enforce a
minimum cooldown instead of breaking immediately when budget.before_batch()
returns allowed=False with no retry delay. Update the backfill wait logic in the
while True block so a denied decision with retry_after_seconds missing or zero
still sleeps for the required floor (at least 200ms, and never below the
documented 100ms minimum), then retries rather than exiting the loop.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e1bb2cf4-19c6-4e77-82ab-a0619221b3b5

📥 Commits

Reviewing files that changed from the base of the PR and between 84a4ea8 and ddd56ea.

📒 Files selected for processing (12)
  • CHANGELOG.md
  • CLAUDE.md
  • README.md
  • cleanup.py
  • rag_engine.py
  • sanitize.py
  • secret_redaction.py
  • tests/test_cleanup_sanitize.py
  • tests/test_sanitize.py
  • tests/test_secret_redaction.py
  • tests/test_tools_sanitize.py
  • tools.py

Comment thread rag_engine.py
Comment thread rag_engine.py
Comment thread tests/test_sanitize.py Outdated
…edup

- sanitize.py: worklist holds doc_ids + value-free metadata only (no
  document payloads cached) → JIT row fetch via rag_engine.get_row_by_doc_id;
  set-based membership (O(N^2)→O(N)); resume loads worklist+run_id from the
  checkpoint without re-scanning. Fixes OOM risk on large indices.
- sanitize.py _throttled_embed: a hard budget deny (allowed=False, no retry)
  now aborts the run (status=paused, checkpoint) instead of embedding —
  no longer bypasses the pause/cap gate.
- sanitize.py _AuditWriter: O_CREAT|O_APPEND|0o600 at open — no truncation
  on resume, no perms-after-create race; resumed runs reuse the run_id.
- rag_engine.py upsert_document/delete_by_doc_id: try/finally so the
  ephemeral FTS connection is always closed (was leaking on exception).
- secret_redaction.py: extract _aggregate_maskable_candidates shared by
  redact() + _maskable_values() — removes the duplication (Codacy metric)
  and guarantees no policy drift between live + retroactive paths.
- tests: parameterize FTS-failure cases, fix stale docstring, +coverage for
  JIT fetch / resume / throttle-abort / audit append.

Full suite 356 passed; live dry-run re-validated (audit value-free).

Refs SESF-42

@codacy-production codacy-production Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR successfully implements the retroactive secret sanitizer with the required dry-run, redaction, and deletion modes, supported by a value-free audit system. All acceptance criteria for functional behavior, including mandatory confirmation and throttled re-embedding, are satisfied. However, two critical issues prevent immediate approval: a likely runtime NameError in rag_engine.py regarding text truncation and a blocking synchronous operation in the MCP tool handler that would make the server unresponsive. Furthermore, the Milvus integration should be optimized to use integer primary keys rather than string identifiers to ensure performance and prevent logic desynchronization.

About this PR

  • There is a systemic pattern of duplicating primary key derivation and document fetching logic in rag_engine.py. Centralizing these into private helpers will reduce the risk of desynchronization between the ingestion, sanitization, and deletion paths.

Test suggestions

  • Sanitize scan (dry-run) identifies affected turns and generates a value-free report without writes.
  • Sanitize apply (redact) successfully updates Milvus, re-embeds text via throttled budget, and rewrites FTS metadata.
  • Sanitize apply (drop) successfully deletes turns from both Milvus and FTS stores.
  • Sanitizer refuses to perform destructive operations (apply/drop) without explicit confirmation.
  • Sanitizer audit trail and stdout/MCP responses never contain raw secret values (masking verification).
  • Sanitizer resumes from a checkpoint correctly after an interruption or budget-pause.
  • FTS failure during apply leaves the turn on the worklist for retry/convergence.

TIP Improve review quality by adding custom instructions
TIP How was this review? Give us feedback

Comment thread tools.py Outdated
Comment thread rag_engine.py Outdated
Comment thread rag_engine.py
Comment thread rag_engine.py
Comment thread tests/test_tools_sanitize.py
Comment thread rag_engine.py Outdated
lbruton added 2 commits June 13, 2026 18:20
A redacted payload can expand past Milvus's 65535-byte VARCHAR cap; the FTS
insert must index the same truncated text Milvus stores or the two stores
diverge. +regression test for the >64KB case. (CodeRabbit)

Refs SESF-42
… lookup

- tools.py: sanitize_index offloads scan/apply via asyncio.to_thread so a
  long-running apply no longer blocks the MCP event loop.
- rag_engine.py: extract _pk_from_doc_id (centralizes the sha256[:15] PK
  derivation across insert/delete/get); get_row_by_doc_id + upsert_document
  now look up by integer PK (id == <pk>) for O(1) instead of a VARCHAR scan.
- tests: assert apply(confirmed=False) performs no reads (spies _query_batches
  + get_row_by_doc_id), not just no writes.

Refs SESF-42
@lbruton lbruton merged commit f0a35cd into main Jun 13, 2026
5 checks passed
@lbruton lbruton deleted the feat/SESF-42-retroactive-secret-sanitizer branch June 13, 2026 23:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants