feat(SESF-42): retroactive secret sanitizer (CLI + MCP) by lbruton · Pull Request #40 · lbruton/SessionFlow

lbruton · 2026-06-13T22:39:16Z

Summary

Adds an operator-driven sanitizer that removes secrets already indexed in the Milvus document field, the FTS5 content column, and the embedding vector — the retroactive half of the SESF-35 research (SESF-41 shipped the ingestion guard). Exposed via a cleanup.py sanitize CLI subcommand and a dedicated sanitize_index MCP tool.

Dry-run is the default — reports per-rule counts + affected turns and writes a value-free 0600 audit JSONL (~/.sessionflow/audit/); zero writes to any store.
--apply --yes redacts in place and re-embeds the redacted text (throttled through the embedding budget, 200ms floor; checkpointed/resumable). --apply --yes --drop deletes affected turns instead.
Both surfaces refuse to apply without explicit confirmation and never emit a secret value.
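
The confirmation gate described above can be sketched with argparse. This is a minimal illustration, not the code in cleanup.py: the flag names come from this PR, while `build_parser` and `resolve_mode` are hypothetical helpers introduced only for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags named in this PR."""
    parser = argparse.ArgumentParser(prog="cleanup.py sanitize")
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--dry-run", action="store_true", help="scan only (default)")
    mode.add_argument("--apply", action="store_true", help="redact in place")
    parser.add_argument("--drop", action="store_true", help="delete affected turns")
    parser.add_argument("--yes", action="store_true", help="confirm a destructive run")
    return parser


def resolve_mode(argv: list[str]) -> str:
    """Return 'dry-run', 'redact', 'drop', or 'refused' without touching any store."""
    args = build_parser().parse_args(argv)
    if args.drop and not args.apply:
        return "refused"  # --drop is only meaningful together with --apply
    if not args.apply:
        return "dry-run"  # default posture: report only, zero writes
    if not args.yes:
        return "refused"  # destructive run without explicit confirmation
    return "drop" if args.drop else "redact"
```

Resolving the mode before any store is opened keeps a refused run free of both reads and writes, which is the property the later review comments ask the tests to assert.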

New primitives

secret_redaction.scan_spans — per-occurrence, value-free audit spans (reuses the SESF-41 detector); snippets mask every detected span in-window.
rag_engine.upsert_document (Milvus upsert-by-PK + FTS metadata-preserving rewrite) and delete_by_doc_id (DeleteResult surfaces FTS outcome). FTS-rewrite/-delete failure keeps a turn retryable, never silently "done".
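
A value-free span scan of the kind described for scan_spans might look like the sketch below. The regex is a stand-in detector (an assumption; the real rules live in secret_redaction.py), the `Span` fields are a simplified subset, and `toy_scan_spans` is a hypothetical name.

```python
import re
from typing import NamedTuple

# Stand-in detector (assumption): not the SESF-41 rule set.
_TOY_RULE = re.compile(r"(?:api_key|token)=(\S+)")


class Span(NamedTuple):
    rule: str
    offset: int
    length: int
    masked_snippet: str


def toy_scan_spans(text: str, window: int = 12) -> list[Span]:
    """Return one value-free span per occurrence of the toy rule."""
    hits = [(m.start(1), m.end(1)) for m in _TOY_RULE.finditer(text)]
    # Mask every detected value first, so a neighboring secret that falls
    # inside another span's window is never shown in the clear.
    masked = list(text)
    for start, end in hits:
        masked[start:end] = "*" * (end - start)
    masked_text = "".join(masked)
    spans = []
    for start, end in hits:
        lo, hi = max(0, start - window), min(len(text), end + window)
        spans.append(Span("TOY_ASSIGNMENT", start, end - start, masked_text[lo:hi]))
    return spans
```

Masking the whole text before cutting snippet windows is one simple way to get the "mask every detected span in-window" property; the offsets and lengths still refer to the original text.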

Linked issue

Plane SESF-42 — Retroactive secret sanitizer for already-indexed data (CLI + MCP)

Tests

pytest full suite: 349 passed / 0 failed (300 baseline + 49 new). ruff clean.
TDD: 26 failing tests written before implementation; +5 regression tests from CodeRabbit review.
Codacy local SAST: PASS (0 introduced Critical/High).
CodeRabbit: 3 correctness findings (empty-vector-on-resume, FTS metadata loss, drop done-marking) — all fixed + regression-tested.

Live exposure baseline (dry-run, no writes)

Ran against the live Standalone index: 1296 affected turns (HIGH_ENTROPY 6136, ASSIGNMENT 882; codex 5628 / claude_code_cli 980 / antigravity_desktop 203 / opencode 118 / antigravity_cli 89). Audit verified value-free on real data (7018 records, 0 raw-value fields, 0 leaking snippets, 0600).

Out of scope

Destructive --apply on the live index (a separate, deliberate operator action).
Upstream source-transcript rewriting (SessionFlow sanitizes only what it owns).

Version

No version bump — SessionFlow has no version-lock system.

Redaction is irreversible and is not a substitute for rotation — rotate any key that was ever indexed.

Summary by CodeRabbit

New Features
- Added a retroactive secret sanitization workflow to find previously stored secrets and either redact them or remove affected turns.
- Dry-run is the default, with explicit confirmation required for any destructive action.
- Added scope filters to target specific projects, sessions, providers, or date ranges.
- Sanitization reports now show counts, status, and audit details without exposing secret values.

Add an operator-driven sanitizer that removes secrets already indexed in Milvus (document), FTS5 (content), and the embedding vector.

- secret_redaction.scan_spans: per-occurrence, value-free audit spans (rule/tier/offset/length/masked_snippet); masked_snippet masks every detected span in its window (no raw value, any tier).
- rag_engine.upsert_document / delete_by_doc_id: Milvus upsert-by-PK + FTS delete-then-insert (redact) / dual delete (drop); FTS failure surfaced distinctly so a row stays retryable.
- sanitize.py: scan (dry-run) / apply (redact|drop) orchestrator with worklist+done checkpoint, throttled re-embed (200ms floor), and 0600 value-free audit JSONL.
- cleanup.py sanitize subcommand (refuse without --yes) + tools.py sanitize_index MCP tool (refuse apply without confirm).
- Tests: 44 new (scan_spans, orchestrator, primitives, CLI, MCP).

Live dry-run baseline: 1296 affected turns (HIGH_ENTROPY 6136, ASSIGNMENT 882); audit verified value-free on real data.

Refs SESF-42

- upsert_document: new_vector now Optional; None preserves the stored vector (resume/FTS-only converge no longer corrupts the 768-dim row).
- upsert_document: FTS rewrite rebuilds the full metadata record from the fetched row (was dropping ~13 columns → degraded filtered/BM25 search).
- delete_by_doc_id now returns DeleteResult(deleted, fts_ok); drop path marks a turn done only when both Milvus and FTS deletes succeed, mirroring the redact FTS-incomplete contract.
- _build_worklist caches scanned rows so the resume/no-spans path is reachable.
- tools.py refusal wording reflects drop; sanitize.apply pause_event typed.
- +5 tests for the previously-mocked vector/metadata/drop-FTS paths.

Refs SESF-42
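
The "done only when both stores succeed" contract for the drop path can be illustrated as follows. The `DeleteResult(deleted, fts_ok)` shape is taken from the commit message; reading `deleted` as the Milvus outcome is an assumption, and `mark_done_if_complete` is a hypothetical helper, not a function from sanitize.py.

```python
from typing import NamedTuple


class DeleteResult(NamedTuple):
    deleted: bool  # assumed to mean: the Milvus delete succeeded
    fts_ok: bool   # the FTS5 delete succeeded


def mark_done_if_complete(doc_id: str, result: DeleteResult,
                          done: set[str], incomplete: set[str]) -> None:
    """Record a turn as done only when both stores report success."""
    if result.deleted and result.fts_ok:
        done.add(doc_id)
        incomplete.discard(doc_id)
    else:
        incomplete.add(doc_id)  # stays retryable on the next run
```

A turn that lands in `incomplete` is picked up again on resume, so a transient FTS failure cannot leave a secret searchable while the checkpoint claims it is gone.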

coderabbitai · 2026-06-13T22:39:22Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 44f2194a-338b-4380-b848-4a43fcb44f02

📥 Commits

Reviewing files that changed from the base of the PR and between ddd56ea and e0fe9b1.

📒 Files selected for processing (6)

rag_engine.py
sanitize.py
secret_redaction.py
tests/test_sanitize.py
tests/test_secret_redaction.py
tools.py

Walkthrough

Implements SESF-42: a retroactive secret sanitizer that detects previously-indexed secrets in Milvus and FTS5. Adds secret_redaction.scan_spans(), dual-write rag_engine.upsert_document/delete_by_doc_id primitives, a sanitize.py orchestrator with dry-run/apply/drop modes, a cleanup.py sanitize CLI subcommand, and a sanitize_index MCP tool, each with comprehensive tests and docs.

Changes

SESF-42 Retroactive Secret Sanitizer

| Layer / File(s) | Summary |
| --- | --- |
| Span detection primitive: `secret_redaction.py`, `tests/test_secret_redaction.py` | Adds `Span` NamedTuple and public `scan_spans(text, *, mode, allowlist)` with internal `_maskable_values()` and `_collect_spans()`. Returns per-occurrence spans with value-free `masked_snippet`; identical span list across `"report"` and `"enforce"` modes. Tests assert per-occurrence semantics, offset accuracy, tier assignment, snippet value-freedom (including neighbor-overlap and keyword-outside-window edge cases), determinism, and `redact()` contract stability. |
| Dual-write RAG engine primitives: `rag_engine.py`, `tests/test_sanitize.py` (primitive harness) | Adds `UpsertResult`/`DeleteResult` NamedTuples and `upsert_document()`/`delete_by_doc_id()`. Each function writes Milvus then FTS5 and surfaces `milvus_ok`/`fts_ok` independently; `new_vector=None` preserves the existing stored vector. Primitive contract tests verify Milvus upsert + FTS delete-then-insert ordering, metadata copying, distinct failure reporting, and dual-delete behavior. |
| sanitize.py orchestrator: `sanitize.py`, `tests/test_sanitize.py` (orchestration tests) | New module with `Scope` (filter builder), `SanitizeReport`, keyset-batched `scan()`, and `apply()` with per-turn checkpointing, FTS-failure incomplete tracking, pause-event support, throttled re-embedding, and value-free JSONL audit trail under `~/.sessionflow/audit/` (0600). `_apply_redact()` re-embeds redacted text; `_apply_drop()` deletes instead. Tests cover scope filtering, dry-run no-writes contract, redact/drop apply, confirmation gate, FTS-failure incompletion, resume/skip semantics, budget throttling, and no-leak invariants. |
| CLI subcommand: `cleanup.py`, `tests/test_cleanup_sanitize.py` | Adds `cmd_sanitize()` and `_print_sanitize_report()` plus parser wiring for `sanitize` (mutually exclusive `--dry-run`/`--apply`, `--drop`, `--yes`, and scoping flags). `--apply` requires `--yes`; `--drop` requires `--apply`. Tests cover parser registration, default dry-run posture, confirmation gate refusal, apply dispatch, scope flag mapping, and CLI no-leak invariant. |
| MCP tool adapter: `tools.py`, `tests/test_tools_sanitize.py` | Adds `format_sanitize_report()` markdown formatter, `sanitize_index` tool registration (schema: `apply`, `drop`, `confirm`, scoping fields — all optional, none required), and `call_tool` dispatch with scan/refuse/apply routing. Tests assert schema shape, default scan path, confirm-gate refusal, apply-with-confirm flag threading, drop propagation, and no-leak invariant across all response paths. |
| Feature docs: `README.md`, `CLAUDE.md`, `CHANGELOG.md` | Adds README section on `cleanup.py sanitize` usage (dry-run, `--apply`/`--drop`/`--yes`, scoping, audit trail format, rotation warning). CLAUDE.md adds SESF-42 operational notes on FTS retry semantics, irreversibility, and checkpoint locations. CHANGELOG records new primitives and behaviors. |

Sequence Diagram(s)

sequenceDiagram
  participant Operator
  participant cleanup.py / sanitize_index
  participant sanitize.py
  participant secret_redaction
  participant rag_engine
  participant Milvus
  participant FTS5

  rect rgba(100, 100, 200, 0.5)
    note over Operator,FTS5: Dry-run (scan)
    Operator->>cleanup.py / sanitize_index: sanitize (no --apply / apply=false)
    cleanup.py / sanitize_index->>sanitize.py: scan(scope)
    sanitize.py->>Milvus: keyset-batch query in-scope rows
    Milvus-->>sanitize.py: rows
    sanitize.py->>secret_redaction: scan_spans(document, mode="report")
    secret_redaction-->>sanitize.py: (text, spans)
    sanitize.py-->>cleanup.py / sanitize_index: SanitizeReport(mode="dry-run", audit_path)
    cleanup.py / sanitize_index-->>Operator: counts + audit path (no secret values)
  end

  rect rgba(100, 200, 100, 0.5)
    note over Operator,FTS5: Apply (redact or drop)
    Operator->>cleanup.py / sanitize_index: sanitize --apply --yes / apply=true, confirm=true
    cleanup.py / sanitize_index->>sanitize.py: apply(scope, drop, confirmed=True)
    loop per doc_id in worklist
      alt drop=False
        sanitize.py->>secret_redaction: scan_spans(document, mode="enforce")
        secret_redaction-->>sanitize.py: (redacted_text, spans)
        sanitize.py->>rag_engine: upsert_document(doc_id, redacted_text, new_vector)
        rag_engine->>Milvus: upsert row
        rag_engine->>FTS5: delete then insert
        rag_engine-->>sanitize.py: UpsertResult(milvus_ok, fts_ok)
      else drop=True
        sanitize.py->>rag_engine: delete_by_doc_id(doc_id)
        rag_engine->>Milvus: delete row
        rag_engine->>FTS5: delete row
        rag_engine-->>sanitize.py: DeleteResult(deleted, fts_ok)
      end
      sanitize.py->>sanitize.py: checkpoint after each turn
    end
    sanitize.py-->>cleanup.py / sanitize_index: SanitizeReport(rotate_warning=True)
    cleanup.py / sanitize_index-->>Operator: report (no secret values)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

lbruton/SessionFlow#37: Introduced the tiered secret redaction engine (secret_redaction.py, redact(), Hit) that this PR directly extends with the new Span type and scan_spans() per-occurrence audit API.

Poem

Old secrets crept inside the index, hiding in the dark,
Now scan_spans hunts them one by one and leaves a value-free mark.
With --apply --yes the redact path rewrites and re-embeds,
Or --drop deletes the row entire — no trace of what it said.
Checkpointed, throttled, audit-logged, with FTS errors tracked,
The secrets gone, the index clean — rotation still exact. 🔑

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 63.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: adding a retroactive secret sanitizer exposed through CLI and MCP. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

codacy-production · 2026-06-13T22:40:49Z

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

🟢 Metrics 315 complexity · 4 duplication

Metric Results

Complexity ✅ 315 (≤ 500 complexity)

Duplication ✅ 4 (≤ 5 duplication)

codacy-production

Pull Request Overview

The PR is currently not up to standards due to critical scalability risks and a failure to correctly implement hardware-constrained throttling. While the core functionality for retroactive sanitization is present, the implementation contains a high-severity risk of Out-Of-Memory (OOM) errors when processing large indices because full document payloads are cached in memory.

Additionally, the re-embedding throttle contains a logic flaw that allows it to bypass safety waits, violating the requirement for hardware-aligned budget management. Finally, there is significant logic duplication between real-time and retroactive scanning paths, which must be refactored to ensure consistent secret prioritization and maintainability.

About this PR

The orchestration logic for scanning rows is not scalable. Caching full document payloads for every turn in a repository will lead to OOM crashes during large-scale index cleaning. The system should be refactored to process payloads just-in-time or only cache lightweight metadata.
There is a systemic duplication of secret candidate aggregation and filtering logic between the live redaction module and this new retroactive scanner. This creates a high risk of 'behavioral drift' where real-time and retroactive scanning apply different rules to the same content.

Test suggestions

Verify that a dry-run identifies secrets and generates an audit log without modifying Milvus or FTS.
Verify that apply mode refuses to execute and returns an error/status when confirmation is missing.
Verify that the audit log and CLI/MCP output contain only rule names, counts, and masked snippets, never raw secret values.
Verify that a resumed run skips doc_ids already marked 'done' in the checkpoint.
Verify that re-embedding calls the embedding budget's before_batch/after_batch for throttling.
Verify that FTS failures during a redact or drop operation leave the turn as retryable (not marked done).
Verify that upsert_document preserves all Milvus metadata fields while updating the document and vector.

codacy-production

Pull Request Overview

The pull request is currently not up to standards due to high-risk performance issues and security vulnerabilities. Specifically, the _build_worklist function in sanitize.py poses a significant risk of OutOfMemoryError when operating on large Milvus indices because it attempts to cache full document content in memory.

From a security and integrity perspective, the audit logging mechanism contains a race condition in permission handling and incorrectly truncates existing logs when a sanitization run is resumed. These issues must be addressed to ensure a secure and reliable audit trail. While the implementation successfully meets most acceptance criteria regarding dry-run defaults and confirmation prompts, the code duplication identified in redaction logic should also be resolved to prevent policy drift.

Copilot

Pull request overview

Adds SESF-42 “retroactive secret sanitizer” capability to remove secrets that were already indexed (Milvus document, FTS5 content, and derived embeddings), exposed via both the cleanup.py sanitize CLI and a new sanitize_index MCP tool. This extends the existing SESF-41 ingestion-time guard by enabling operator-driven cleanup of historical data while preserving the “no secret value emitted” invariant via span-aware, value-free auditing.

Changes:

Introduces secret_redaction.scan_spans and span/snippet data model to support per-occurrence, value-free auditing.
Adds sanitizer orchestrator (sanitize.py) plus CLI/MCP adapters (confirmation-gated apply; dry-run default) and extensive tests.
Adds rag_engine.upsert_document / delete_by_doc_id primitives to rewrite/delete across Milvus + FTS with distinct FTS success reporting.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file

| File | Description |
| --- | --- |
| tools.py | Registers `sanitize_index` MCP tool and adds markdown formatting for sanitize reports. |
| sanitize.py | Implements scan/apply orchestration, audit JSONL writing, checkpointing, and embedding throttling. |
| secret_redaction.py | Adds `Span` + `scan_spans()` for per-occurrence, value-free audit spans/snippets. |
| rag_engine.py | Adds `upsert_document` / `delete_by_doc_id` primitives and result types for dual-store operations. |
| cleanup.py | Adds `cleanup.py sanitize` subcommand and value-free report printing with confirmation gate. |
| README.md | Documents retroactive sanitizer usage, flags, and audit/no-leak guarantees. |
| CHANGELOG.md | Notes the new SESF-42 sanitizer feature set and operator warnings. |
| tests/test_tools_sanitize.py | Tests MCP adapter wiring, confirmation gating, and no-leak output contract. |
| tests/test_cleanup_sanitize.py | Tests CLI wiring, refusal-before-reads/writes semantics, and no-leak output. |
| tests/test_secret_redaction.py | Adds regression coverage for span/snippet masking invariants in `scan_spans`. |
| tests/test_sanitize.py | Adds orchestrator + rag_engine primitive tests for scan/apply/drop/resume/budget behavior. |

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (3)

sanitize.py (3)

243-252: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Audit file truncates existing logs on resume and has a TOCTOU race on permissions.

Opening with "w" mode destroys any prior audit entries for this run_id (e.g. on resume). The chmod after open leaves a window where the file exists with default umask permissions. Use atomic open with mode:

-        self._handle = open(self.path, "w", encoding="utf-8")
-        try:
-            os.chmod(self.path, _FILE_MODE)
-        except OSError:
-            pass
+        fd = os.open(self.path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, _FILE_MODE)
+        self._handle = os.fdopen(fd, "w", encoding="utf-8")

For resume scenarios, consider append mode (O_APPEND) instead of truncate, or use a fresh run_id.
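
The append-mode variant suggested above can be sketched as below. It assumes a POSIX system; `_FILE_MODE` mirrors the constant in the diff, while `open_audit_log` and `append_record` are hypothetical names rather than the `_AuditWriter` API.

```python
import json
import os

_FILE_MODE = 0o600


def open_audit_log(path: str):
    """Open (or create) an audit file for appending with owner-only permissions."""
    # The mode is applied when the file is created, so it never exists with
    # wider default permissions. O_APPEND keeps earlier records on resume.
    # An already existing file keeps whatever mode it had.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, _FILE_MODE)
    return os.fdopen(fd, "a", encoding="utf-8")


def append_record(path: str, record: dict) -> None:
    """Write one JSONL record; the caller must pass value-free fields only."""
    with open_audit_log(path) as handle:
        handle.write(json.dumps(record) + "\n")
```

Because the permission bits travel with the `os.open` call, there is no window between creation and `chmod`, and a resumed run with the same run_id extends the file instead of replacing it.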

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sanitize.py` around lines 243 - 252, Update the audit file initialization in
sanitize.py’s __init__ so it does not truncate existing run logs on resume;
switch the file open behavior away from "w" to an atomic create/open strategy
with the intended permissions, and use append semantics if the same run_id must
preserve prior entries. Keep the permission-setting logic tied to the open path
to avoid the post-open TOCTOU window, and adjust the _audit_path/run_id handling
only if needed to support a fresh run_id for truncate-on-new-run behavior.

307-318: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

O(N) membership check on list creates O(N²) scan complexity.

doc_id not in worklist is O(N) for lists. With large indices this becomes a bottleneck. Use a set for tracking seen doc_ids:

     counts: dict = {}
-    worklist: list[str] = []
+    worklist: list[str] = []
+    worklist_set: set[str] = set()
     audit = _AuditWriter(run_id)
     try:
         for row in _iter_rows(scope, db_path):
             ...
             doc_id = row.get("doc_id")
-            if doc_id not in worklist:
+            if doc_id not in worklist_set:
+                worklist_set.add(doc_id)
                 worklist.append(doc_id)
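
As a standalone illustration of the diff above, the following sketch keeps the list for first-seen ordering and adds a set only for membership checks. `build_worklist` is a hypothetical name and takes plain doc_ids rather than scanned rows.

```python
from typing import Iterable


def build_worklist(doc_ids: Iterable[str]) -> list[str]:
    """Deduplicate doc_ids in first-seen order with O(1) membership checks."""
    worklist: list[str] = []
    seen: set[str] = set()
    for doc_id in doc_ids:
        if doc_id not in seen:
            seen.add(doc_id)
            worklist.append(doc_id)
    return worklist
```

Keeping both structures costs one extra reference per doc_id but preserves processing order, which a bare set would not guarantee.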

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sanitize.py` around lines 307 - 318, The worklist variable is initialized as
a list but used with the membership check operator (not in), which performs an
O(N) search. This creates O(N²) overall complexity when looping through rows.
Change the worklist initialization from list[str] = [] to set[str] = set(), and
replace the worklist.append(doc_id) call with worklist.add(doc_id). This will
reduce the membership check to O(1) and overall scan complexity to O(N).

368-375: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Throttle loop exits without enforcing the 200ms cooldown floor when budget denies but provides no delay.

When allowed=False and retry_after_seconds is 0/None, the loop breaks immediately without any wait. Per CLAUDE.md and retrieved learnings, the embedding backfill must never reduce cooldown below 100ms due to MLX Metal SIGSEGV risk.

             delay = getattr(decision, "retry_after_seconds", 0.0) or 0.0
-            if delay <= 0:
-                break
+            if delay <= 0:
+                delay = 0.2  # enforce 200ms floor per CLAUDE.md
             time.sleep(delay)
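
A retry loop with the floor the reviewer describes could look like this sketch. `Decision` is a hypothetical stand-in for whatever `budget.before_batch()` returns, and `max_attempts` is an assumption added so the loop terminates. Note that the follow-up commit in this PR took a different route and aborts the run on a hard deny instead of retrying.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

MIN_COOLDOWN_SECONDS = 0.2  # the 200ms floor cited in the review


@dataclass
class Decision:
    """Hypothetical stand-in for the budget's before_batch() result."""
    allowed: bool
    retry_after_seconds: Optional[float] = None


def wait_for_budget(before_batch: Callable[[], Decision],
                    sleep: Callable[[float], None] = time.sleep,
                    max_attempts: int = 50) -> bool:
    """Poll the budget until allowed; never sleep less than the floor."""
    for _ in range(max_attempts):
        decision = before_batch()
        if decision.allowed:
            return True
        delay = decision.retry_after_seconds or 0.0
        sleep(max(delay, MIN_COOLDOWN_SECONDS))
    return False  # caller should pause and checkpoint rather than embed
```

Injecting `sleep` keeps the loop testable without real waiting, and returning `False` leaves the pause-or-abort decision to the caller.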

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sanitize.py` around lines 368 - 375, The throttle loop in sanitize.py should
always enforce a minimum cooldown instead of breaking immediately when
budget.before_batch() returns allowed=False with no retry delay. Update the
backfill wait logic in the while True block so a denied decision with
retry_after_seconds missing or zero still sleeps for the required floor (at
least 200ms, and never below the documented 100ms minimum), then retries rather
than exiting the loop.

Sources: Coding guidelines, Learnings

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@rag_engine.py`:
- Around line 2041-2046: Update rag_engine.upsert_document so the FTS rewrite
uses the same UTF-8 truncated text stored in row["document"] rather than
new_document; keep the Milvus upsert and metadata handling unchanged, but pass
the truncated document value into the FTS insert path at the anchor site and the
sibling site (rag_engine.py 2041-2046 and 2061-2063) to preserve dual-write
alignment.
- Around line 2057-2085: The FTS rewrite logic in rag_engine.py only closes the
ephemeral SQLite connection on the success path; update both rewrite blocks
around _fts.connection(), _fts.delete(), and _fts.insert() so
_fts.close_ephemeral(conn) runs in a finally block (or equivalent) whenever conn
is created, including when delete/insert raises. Keep fts_ok handling and
logging intact, but ensure the connection is always released on failure paths.

In `@tests/test_sanitize.py`:
- Around line 306-317: In the test function
test_apply_without_confirmation_makes_no_calls, add an assertion that verifies
no reads were performed on the stubbed engine when confirmed=False is passed to
sanitize.apply(). Currently the test only asserts that writes (upserts, deletes,
embed_inputs) are blocked, but does not verify that the code skips reading from
the index altogether. Add an assertion checking the appropriate field in the cap
object that tracks read/search operations to ensure that reads are also blocked
when confirmation is not provided, consistent with the security requirement that
destructive sanitize operations must refuse unless explicitly confirmed.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e1bb2cf4-19c6-4e77-82ab-a0619221b3b5

📥 Commits

Reviewing files that changed from the base of the PR and between 84a4ea8 and ddd56ea.

📒 Files selected for processing (12)

CHANGELOG.md
CLAUDE.md
README.md
cleanup.py
rag_engine.py
sanitize.py
secret_redaction.py
tests/test_cleanup_sanitize.py
tests/test_sanitize.py
tests/test_secret_redaction.py
tests/test_tools_sanitize.py
tools.py

…edup

- sanitize.py: worklist holds doc_ids + value-free metadata only (no document payloads cached) → JIT row fetch via rag_engine.get_row_by_doc_id; set-based membership (O(N^2)→O(N)); resume loads worklist+run_id from the checkpoint without re-scanning. Fixes OOM risk on large indices.
- sanitize.py _throttled_embed: a hard budget deny (allowed=False, no retry) now aborts the run (status=paused, checkpoint) instead of embedding — no longer bypasses the pause/cap gate.
- sanitize.py _AuditWriter: O_CREAT|O_APPEND|0o600 at open — no truncation on resume, no perms-after-create race; resumed runs reuse the run_id.
- rag_engine.py upsert_document/delete_by_doc_id: try/finally so the ephemeral FTS connection is always closed (was leaking on exception).
- secret_redaction.py: extract _aggregate_maskable_candidates shared by redact() + _maskable_values() — removes the duplication (Codacy metric) and guarantees no policy drift between live + retroactive paths.
- tests: parameterize FTS-failure cases, fix stale docstring, +coverage for JIT fetch / resume / throttle-abort / audit append.

Full suite 356 passed; live dry-run re-validated (audit value-free).

Refs SESF-42

codacy-production

Pull Request Overview

This PR successfully implements the retroactive secret sanitizer with the required dry-run, redaction, and deletion modes, supported by a value-free audit system. All acceptance criteria for functional behavior, including mandatory confirmation and throttled re-embedding, are satisfied. However, two critical issues prevent immediate approval: a likely runtime NameError in rag_engine.py regarding text truncation and a blocking synchronous operation in the MCP tool handler that would make the server unresponsive. Furthermore, the Milvus integration should be optimized to use integer primary keys rather than string identifiers to ensure performance and prevent logic desynchronization.

About this PR

There is a systemic pattern of duplicating primary key derivation and document fetching logic in rag_engine.py. Centralizing these into private helpers will reduce the risk of desynchronization between the ingestion, sanitization, and deletion paths.
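
The centralization suggested here can be sketched as a single helper. The sha256[:15] slice and the `id == <pk>` filter shape are taken from the follow-up commit message in this PR; interpreting the 15 characters as hexadecimal digits converted to an integer is an assumption, and the function names are hypothetical.

```python
import hashlib


def pk_from_doc_id(doc_id: str) -> int:
    """Derive an integer key from the first 15 hex digits of sha256(doc_id)."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).hexdigest()
    # 15 hex digits are 60 bits, so the value always fits a signed 64-bit field.
    return int(digest[:15], 16)


def pk_filter(doc_id: str) -> str:
    """Build an `id == <pk>` filter expression for an integer-key lookup."""
    return f"id == {pk_from_doc_id(doc_id)}"
```

With one derivation shared by insert, delete, and get, the three paths cannot disagree about which row a doc_id maps to.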

Test suggestions

Sanitize scan (dry-run) identifies affected turns and generates a value-free report without writes.
Sanitize apply (redact) successfully updates Milvus, re-embeds text via throttled budget, and rewrites FTS metadata.
Sanitize apply (drop) successfully deletes turns from both Milvus and FTS stores.
Sanitizer refuses to perform destructive operations (apply/drop) without explicit confirmation.
Sanitizer audit trail and stdout/MCP responses never contain raw secret values (masking verification).
Sanitizer resumes from a checkpoint correctly after an interruption or budget-pause.
FTS failure during apply leaves the turn on the worklist for retry/convergence.

A redacted payload can expand past Milvus's 65535-byte VARCHAR cap; the FTS insert must index the same truncated text Milvus stores or the two stores diverge. +regression test for the >64KB case. (CodeRabbit) Refs SESF-42
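
One way to keep the two stores aligned is to truncate once and hand the same string to both writers. The sketch below assumes the cap is measured in UTF-8 bytes, as stated above; `truncate_utf8` and `prepare_dual_write` are hypothetical names, not the rag_engine.py helpers.

```python
MAX_VARCHAR_BYTES = 65535  # the Milvus VARCHAR cap cited above


def truncate_utf8(text: str, max_bytes: int = MAX_VARCHAR_BYTES) -> str:
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops a trailing partial multi-byte sequence, if any.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


def prepare_dual_write(redacted: str) -> tuple[str, str]:
    """Return the (Milvus, FTS) payloads; both use the same truncated text."""
    stored = truncate_utf8(redacted)
    return stored, stored
```

Truncating on bytes rather than characters matters for non-ASCII text, where a character count under the cap can still exceed the byte limit.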

… lookup

- tools.py: sanitize_index offloads scan/apply via asyncio.to_thread so a long-running apply no longer blocks the MCP event loop.
- rag_engine.py: extract _pk_from_doc_id (centralizes the sha256[:15] PK derivation across insert/delete/get); get_row_by_doc_id + upsert_document now look up by integer PK (id == <pk>) for O(1) instead of a VARCHAR scan.
- tests: assert apply(confirmed=False) performs no reads (spies _query_batches + get_row_by_doc_id), not just no writes.

Refs SESF-42
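
The event-loop offload mentioned in this commit can be illustrated with a small, self-contained sketch. `blocking_apply` and `handle_sanitize_index` are hypothetical stand-ins, not the tools.py handler; only `asyncio.to_thread` (Python 3.9+) is the technique named in the commit.

```python
import asyncio
import time


def blocking_apply(turns: int) -> str:
    """Stand-in for a long synchronous sanitize run (hypothetical)."""
    time.sleep(0.05 * turns)
    return f"processed {turns} turns"


async def handle_sanitize_index(turns: int) -> str:
    """Run the blocking work in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(blocking_apply, turns)


async def demo() -> tuple[str, int]:
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    # The heartbeat keeps ticking only if the loop is not blocked.
    beat = asyncio.create_task(heartbeat())
    result = await handle_sanitize_index(2)
    beat.cancel()
    await asyncio.gather(beat, return_exceptions=True)
    return result, ticks
```

Calling `asyncio.run(demo())` returns the result together with a non-zero tick count, which shows other coroutines made progress while the synchronous work ran.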

lbruton added 3 commits June 13, 2026 17:20

docs(SESF-42): changelog entry for the retroactive secret sanitizer

903bd1d

Copilot AI review requested due to automatic review settings June 13, 2026 22:39

Copilot started reviewing on behalf of lbruton June 13, 2026 22:39 View session

docs(SESF-42): add retroactive sanitizer operational gotcha to CLAUDE.md

ddd56ea

codacy-production Bot reviewed Jun 13, 2026

Comment thread sanitize.py Outdated

Comment thread secret_redaction.py Outdated

Comment thread sanitize.py Outdated

Comment thread sanitize.py

codacy-production Bot reviewed Jun 13, 2026

Comment thread sanitize.py

Comment thread secret_redaction.py Outdated

Comment thread sanitize.py Outdated

Comment thread sanitize.py Outdated

Comment thread tests/test_sanitize.py Outdated

Copilot AI reviewed Jun 13, 2026

Comment thread sanitize.py Outdated

Comment thread rag_engine.py

Comment thread rag_engine.py

Comment thread tests/test_sanitize.py

lbruton marked this pull request as ready for review June 13, 2026 23:01

Copilot AI review requested due to automatic review settings June 13, 2026 23:01

Copilot started reviewing on behalf of lbruton June 13, 2026 23:02 View session

lbruton removed the request for review from Copilot June 13, 2026 23:05

coderabbitai Bot reviewed Jun 13, 2026

Comment thread rag_engine.py

Comment thread rag_engine.py

Comment thread tests/test_sanitize.py Outdated

codacy-production Bot reviewed Jun 13, 2026

Comment thread tools.py Outdated

Comment thread rag_engine.py Outdated

Comment thread rag_engine.py

Comment thread rag_engine.py

Comment thread tests/test_tools_sanitize.py

Comment thread rag_engine.py Outdated

lbruton added 2 commits June 13, 2026 18:20

lbruton merged commit f0a35cd into main Jun 13, 2026
5 checks passed

lbruton deleted the feat/SESF-42-retroactive-secret-sanitizer branch June 13, 2026 23:29
