[codex] Add DeepSec PR scanning#2110
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR adds DeepSec security scanning infrastructure: a Bun workspace with DeepSec config, project threat model and guidance, operational docs and setup, a local OpenAI proxy that enforces budgets and request validation, and a GitHub Actions workflow that runs sanitized scans on pull requests and publishes findings. ChangesDeepSec Security Scanning Setup
Sequence DiagramssequenceDiagram
participant GitHubWF
participant DeepSec
participant Proxy
participant OpenAI
GitHubWF->>DeepSec: Start deepsec process (container) with PROXY env
DeepSec->>Proxy: POST /v1/chat/completions (Authorization: Bearer clientToken)
Proxy->>Proxy: Validate client token & model, cap output tokens
Proxy->>OpenAI: Forward request with OpenAI token
OpenAI-->>Proxy: Stream response
Proxy-->>DeepSec: Stream capped response back
sequenceDiagram
participant PR as Pull Request
participant WF as GitHub Actions
participant Scanner as Trusted Scanner Checkout
participant Target as PR Checkout (scan-target)
participant Proxy as AI Proxy
participant DeepSec
participant Comments as PR Comments
WF->>Scanner: Checkout scanner at base SHA
WF->>Target: Checkout PR head and build sanitized scan-target
WF->>Proxy: Start proxy and wait for /health
WF->>DeepSec: Run deepsec process in container (proxy configured)
DeepSec->>Proxy: Request completions/responses
Proxy->>DeepSec: Stream capped responses
DeepSec->>WF: Emit comment.md
WF->>Comments: Upload artifact and upsert PR comment with marker
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Merging this PR will not alter performance
Comparing Footnotes
|
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
.github/workflows/deepsec.yml (1)
83-93: 💤 Low valueConsider adding a brief diagnostic message before failing.
If the proxy fails to start within the 5-second polling window, the log file is dumped, but there's no explicit message indicating what went wrong. Adding a brief diagnostic could help with troubleshooting.
Suggested improvement
if ! curl -fsS http://127.0.0.1:8787/health >/dev/null 2>&1; then + echo "AI gateway proxy failed to start within 5 seconds." cat "$RUNNER_TEMP/ai-gateway-proxy.log" exit 1 fi🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/deepsec.yml around lines 83 - 93, When the health check loop times out, add a concise diagnostic log message before dumping the log and exiting: detect the failure branch after the retry loop that checks http://127.0.0.1:8787/health and emit a clear message (e.g., "ai-gateway-proxy failed to become healthy after 5s; dumping log at $RUNNER_TEMP/ai-gateway-proxy.log") to stderr or echo so the reason is explicit, then proceed to cat "$RUNNER_TEMP/ai-gateway-proxy.log" and exit 1; update the failure branch around the existing curl check and log file dump accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.deepsec/AGENTS.md:
- Around line 12-14: Update the command example that currently reads "deepsec
init-project <root>" to use the workspace-consistent invocation "bun deepsec
init-project <root>" (matching the pattern used in `.deepsec/README.md`); search
for the exact token "deepsec init-project <root>" and replace it with "bun
deepsec init-project <root>" so documentation is consistent across files.
In @.deepsec/README.md:
- Around line 53-64: The fenced code block in .deepsec/README.md is missing a
language tag (triggering MD040); update the opening fence from ``` to include a
language like ```text so the block becomes fenced with a language specifier;
modify the fenced block that starts at the snapshot containing
"deepsec.config.ts Project list (one entry per scanned repo)" to use
```text and keep the rest of the block unchanged.
In @.github/scripts/ai-gateway-proxy.mjs:
- Around line 83-110: The upstream fetch (the call that assigns upstreamResponse
using fetch(upstreamUrl, { method: req.method, headers, body })) must be made
cancellable and timeout-able: create an AbortController, pass its signal to
fetch, set a configurable timeout from an env var (e.g.,
AI_GATEWAY_UPSTREAM_TIMEOUT_MS defaulting to 30000) that calls
controller.abort() when fired, and register req.on('close', ...) to call
controller.abort() if the client disconnects; ensure you clear the timeout timer
on both successful response/stream completion and on errors, and use the same
controller to abort the reader/stream handling (references: upstreamResponse,
fetch(..., { signal }), upstreamUrl, headers, body, res, reader).
---
Nitpick comments:
In @.github/workflows/deepsec.yml:
- Around line 83-93: When the health check loop times out, add a concise
diagnostic log message before dumping the log and exiting: detect the failure
branch after the retry loop that checks http://127.0.0.1:8787/health and emit a
clear message (e.g., "ai-gateway-proxy failed to become healthy after 5s;
dumping log at $RUNNER_TEMP/ai-gateway-proxy.log") to stderr or echo so the
reason is explicit, then proceed to cat "$RUNNER_TEMP/ai-gateway-proxy.log" and
exit 1; update the failure branch around the existing curl check and log file
dump accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b7d7c74f-a0c7-49af-b782-d1bfda9e1d3f
⛔ Files ignored due to path filters (1)
.deepsec/bun.lockis excluded by!**/*.lock
📒 Files selected for processing (10)
.deepsec/.gitignore.deepsec/AGENTS.md.deepsec/README.md.deepsec/data/capgo/INFO.md.deepsec/data/capgo/SETUP.md.deepsec/data/capgo/config.json.deepsec/deepsec.config.ts.deepsec/package.json.github/scripts/ai-gateway-proxy.mjs.github/workflows/deepsec.yml
|
I think the workflow currently leaves the highest-risk PRs outside the new security gate. if: github.event.pull_request.head.repo.full_name == github.repositorySo every forked contributor PR skips the scan entirely, and the If the scanner path is safe to run on fork content as data, I would remove that same-repo guard and keep using the base-branch checkout plus proxy hardening. If forks must remain excluded, I would add a visible fallback check/comment that says DeepSec was intentionally skipped for fork PRs; otherwise maintainers may assume the new gate covered a PR when it never ran. |
KCDaemon
left a comment
There was a problem hiding this comment.
Rechecked the latest head (6883a57). The workflow is clean/green, but I think the current gating leaves the most important PR class outside the new DeepSec check.
deepsec.yml runs on pull_request_target and carefully checks out the trusted scanner from base.sha, but the analyze job has if: github.event.pull_request.head.repo.full_name == github.repository. That means every fork PR skips analyze entirely. The comment job also only runs when needs.analyze.result == 'failure', so a skipped fork scan produces no visible fallback comment/check explaining that DeepSec did not run.
For an OSS repo, fork PRs are the main untrusted contribution path and the place where a security PR scanner is most useful. If the scanner is safe because PR code is checked out as target data while secrets stay behind the local proxy, I would remove the same-repo guard. If fork scans must stay disabled, the workflow should create a visible skipped/fallback result so maintainers do not assume DeepSec covered those PRs.
git diff --check origin/main...origin/pr-2110 passes locally.
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
|
Addressed across 5f7d35d, 9203d1d, 932bd22, and 7780420.
Local validation passed: focused DeepSec process returned 0 findings after the latest workflow/proxy changes, proxy smoke test covered auth/path/model/cost-policy rejection, plus |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/scripts/openai-token-proxy.mjs:
- Around line 75-78: The proxy currently only caps 'max_completion_tokens' (via
capTokenField) for non-/v1/responses requests, letting clients pass 'max_tokens'
for chat completions; update the else branch where pathname !== '/v1/responses'
to call capTokenField(json, 'max_tokens') before capTokenField(json,
'max_completion_tokens') so the proxy enforces the output-token budget for
OpenAI Chat Completions requests (refer to the pathname variable and
capTokenField function to locate the change).
In @.github/workflows/deepsec.yml:
- Around line 41-55: The current step builds the changed-files list using
BASE_SHA (github.event.pull_request.base.sha), which is the branch tip at PR
creation and can cause incorrect diffs; replace the left side of the diff with
the PR merge base instead: compute MERGE_BASE by calling git merge-base against
the target branch tip (e.g., origin/${{ github.base_ref }} and HEAD) or use the
GitHub PR files API to enumerate files, then use that MERGE_BASE in the git diff
command that writes to changed-files.txt (update references to BASE_SHA in the
git cat-file/fetch/diff logic accordingly and remove the brittle
fetch-from-origin fallback).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d115cb30-7fbf-48b9-bcd4-8740d6cc1dcd
📒 Files selected for processing (3)
.deepsec/README.md.github/scripts/openai-token-proxy.mjs.github/workflows/deepsec.yml
✅ Files skipped from review due to trivial changes (1)
- .deepsec/README.md
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/deepsec.yml:
- Around line 217-235: The job 'comment' is currently gated by "if: always() &&
needs.analyze.result == 'failure'", which prevents reconciliation when analyze
later succeeds; remove the "needs.analyze.result == 'failure'" part so the
'comment' job runs on every analyze completion (keep always()), keep the
existing step with id download-comment to fetch the artifact and keep the Upsert
PR comment step conditioned on "if: steps.download-comment.outcome == 'success'"
to upsert when comment.md exists, and add a new step (e.g., "Clear PR comment"
or similar) that runs when download-comment.outcome != 'success' to delete or
clear the existing <!-- capgo-deepsec-findings --> marker comment (use the same
permissions and PR write actions) so the marker is removed when no comment.md
artifact was produced.
- Around line 111-128: The script currently copies every regular file into
scan-target via git -C target show "$HEAD_SHA:$file" without size checks; before
writing each blob, use git -C target cat-file -s "$HEAD_SHA:$file" to get the
blob size and enforce a per-file cap (e.g., MAX_FILE_BYTES) and track a
cumulative counter (e.g., total_bytes) to enforce a total cap (e.g.,
MAX_TOTAL_BYTES); if the file exceeds per-file or would push total over the cap,
skip it and log a warning, and only write the blob and append to scan-files.txt
when both checks pass (affecting the commands around git -C target show
"$HEAD_SHA:$file", mkdir -p "scan-target/$(dirname "$file")", and printf '%s\n'
"$file" >> scan-files.txt).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d89e2e81-bb2c-40ed-9a30-c1feed35a179
📒 Files selected for processing (3)
.deepsec/README.md.github/scripts/openai-token-proxy.mjs.github/workflows/deepsec.yml
✅ Files skipped from review due to trivial changes (1)
- .deepsec/README.md
|



Summary (AI generated)
.deepsecworkspace for Capgo with project-specific scan context and priority paths.pull_request_targetwithout executing PR code.OPENAI_API_TOKENsecret through a local OpenAI proxy; no AI Gateway upstream or token is required.deepsec-fork-prenvironment so they can be maintainer-approved/on-demand while still producing the mandatoryScan PR changescheck.Motivation (AI generated)
DeepSec should block net-new security issues in PRs without requiring contributors to remember a manual scan step. Fork PRs are the highest-risk contribution path, so the workflow must create the same required check there while keeping secrets behind an approval gate and a local proxy.
Business Impact (AI generated)
This adds an automated security gate for changes to Capgo's backend, Cloudflare Workers, database, frontend services, and CLI code. It should reduce the chance of shipping auth, storage, update-delivery, or credential-handling regressions.
Test Plan (AI generated)
bun install --frozen-lockfilein.deepsecbun --cwd .deepsec deepsec scan --project-id capgo.github/workflows/deepsec.ymland.github/scripts/openai-token-proxy.mjsreturned zero findingsgit diff --checkgo run github.com/rhysd/actionlint/cmd/actionlint@latest .github/workflows/deepsec.ymlbunx eslint .github/scripts/openai-token-proxy.mjsbun lintbun run cli:build && vue-tsc --noEmitChecklist (AI generated)
Summary by CodeRabbit
New Features
Documentation
Chores