Skip to content

fix(sandbox): add mechanistic smoke test for L4 deny and document the L4/L7 split#1412

Merged
johntmyers merged 5 commits into
NVIDIA:mainfrom
mesutoezdil:fix/1333-mechanistic-l4-smoke
May 26, 2026
Merged

fix(sandbox): add mechanistic smoke test for L4 deny and document the L4/L7 split#1412
johntmyers merged 5 commits into
NVIDIA:mainfrom
mesutoezdil:fix/1333-mechanistic-l4-smoke

Conversation

@mesutoezdil
Copy link
Copy Markdown
Contributor

@mesutoezdil mesutoezdil commented May 15, 2026

  • Added e2e/policy-advisor/mechanistic-smoke.sh to test the mechanistic mapper with an L4 CONNECT deny
  • Added a Policy Proposals section to architecture/sandbox.md documenting the intentional L4-only scope
  • Wired smoke into mise run e2e:mechanistic-smoke with gateway setup
  • Documented smoke in e2e/policy-advisor/README.md
  • Fixed unbound TMP_DIR reference on early exit (set -u guard)

Related Issue: Refs #1333

Testing

  • bash -n e2e/policy-advisor/mechanistic-smoke.sh passes
  • markdownlint-cli2 architecture/sandbox.md passes with 0 errors
  • mise run e2e:mechanistic-smoke runs the full flow against a Docker gateway

… L4/L7 split

The old smoke script exercised an L7 PUT which hung because the denial
aggregator is only wired to L4 CONNECT denies, not L7 enforcement.

Add mechanistic-smoke.sh which triggers an L4 deny, waits for the
aggregator to flush, and asserts a pending chunk appears under
openshell rule get --status pending.

Document the intentional L4-only scope of the mechanistic mapper in
architecture/sandbox.md.

Fixes NVIDIA#1333

Signed-off-by: mesutoezdil <mesudozdil@gmail.com>
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 15, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

…p call

Signed-off-by: mesutoezdil <mesudozdil@gmail.com>
@TaylorMutch
Copy link
Copy Markdown
Collaborator

I tested the new smoke locally on this branch with the Docker-backed e2e wrapper:

e2e/with-docker-gateway.sh bash -lc '
  target/debug/openshell settings set --global \
    --key agent_policy_proposals_enabled \
    --value true \
    --yes
  OPENSHELL_BIN="$PWD/target/debug/openshell" \
    bash e2e/policy-advisor/mechanistic-smoke.sh
'

It passed: the script created a sandbox, triggered the expected L4 CONNECT deny for blocked.invalid:443, and openshell rule get --status pending showed allow_blocked_invalid_443 for /usr/bin/curl.

A few items still need action before this fully resolves #1333:

  1. Please wire e2e/policy-advisor/mechanistic-smoke.sh into an exercised path, or document clearly that it is manual-only. Right now I do not see it referenced from mise run e2e, tasks/test.toml, .github/workflows/e2e-test.yml, or e2e/policy-advisor/README.md, so CI will not catch regressions in this path.
  2. mechanistic mapper: L4-only by design; retarget e2e smoke and document the split #1333 explicitly asks to “Verify (or add) a positive unit test that an L4 deny enqueues a DenialEvent.” I do not see that covered in this PR. Please add the unit test, or change Fixes #1333 to Refs #1333 and explain the remaining work.
  3. Please initialize TMP_DIR="" before the trap cleanup EXIT. With set -u, early preflight failures can make cleanup() reference an unbound TMP_DIR and mask the real error.

The L4 retarget itself looks correct based on the local run; the main gap is making sure this becomes durable regression coverage and that all acceptance items from #1333 are addressed.

@TaylorMutch TaylorMutch self-assigned this May 18, 2026
- Initialize TMP_DIR before trap to prevent unbound variable on early exit
- Add e2e:mechanistic-smoke mise task with gateway setup
- Document mechanistic smoke in policy-advisor README
Signed-off-by: mesutoezdil <mesudozdil@gmail.com>
@mesutoezdil mesutoezdil force-pushed the fix/1333-mechanistic-l4-smoke branch from 188cbf8 to 2bcc30e Compare May 23, 2026 16:46
@mesutoezdil
Copy link
Copy Markdown
Contributor Author

mesutoezdil commented May 23, 2026

Addressed all 3 items: wired the smoke into mise, guarded TMP_DIR, and added a unit test in proxy::tests::test_emit_denial_enqueues_denial_event that verifies an L4 deny enqueues a DenialEvent with the correct fields.

@johntmyers
Copy link
Copy Markdown
Collaborator

/ok to test 2bcc30e

@johntmyers
Copy link
Copy Markdown
Collaborator

@mesutoezdil please address the format issues on the failed checks. Please ensure these are tested before pushing.

@mesutoezdil
Copy link
Copy Markdown
Contributor Author

mesutoezdil commented May 26, 2026

@mesutoezdil please address the format issues on the failed checks. Please ensure these are tested before pushing.

I remember testing this, but for some reason it’s causing problems.

the clippy lint (unused-qualifications) and rustfmt issues in the test function are resolved in the latest commit.

verified locally before pushing:

cargo test -p openshell-sandbox proxy::tests::test_emit_denial_enqueues_denial_event
test proxy::tests::test_emit_denial_enqueues_denial_event ... ok

mise run pre-commit
Finished in 66.88s (all checks green)

Screenshot 2026-05-26 at 22 07 34 Screenshot 2026-05-26 at 22 07 43

@johntmyers
Copy link
Copy Markdown
Collaborator

/ok to test 18ee550

@johntmyers johntmyers added the test:e2e Requires end-to-end coverage label May 26, 2026
@github-actions
Copy link
Copy Markdown

Label test:e2e applied for 18ee550. Open the existing run and click Re-run all jobs to execute with the label set. The run will execute the standard E2E suite after building the required gateway and supervisor images once. The matching required CI gate status on this PR will flip green automatically once the run finishes.

@johntmyers johntmyers merged commit 7174983 into NVIDIA:main May 26, 2026
37 of 39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

test:e2e Requires end-to-end coverage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants