docs(cua-driver): "Process attribution" explainer (Session 0 on Windows + TCC on macOS) #1594
f-trycua wants to merge 1 commit
Adds a Diataxis-style explanation page that disambiguates the two OS
subsystems that decide what a cua-driver process can touch: Windows
session attribution (Session 0 vs 1+) and macOS TCC ((bundle id,
cdhash) grants). They look identical from the user's POV — "my tool
calls return empty arrays" — but the mechanics are completely
different, and the howtos in `windows-ssh.mdx` and `installation.mdx`
deliberately don't go this deep.
* New section under `docs/content/docs/cua-driver/explanation/`,
  registered in the top-level cua-driver sidebar
  (`meta.json` adds "explanation" alongside "guide" and "reference").
* New page `explanation/process-attribution.mdx` (~130 lines, ~700
  words MDX prose + two diagrams):
  - Opening framing: tool effect depends on the responsible process,
    not the CLI.
  - Windows session attribution: `query session` output, why SSH is
    Session 0, why desktop APIs are session-scoped, the daemon-proxy
    fix from PR #1580.
  - macOS TCC: (bundle id, cdhash) keying, why a `cargo build`
    binary misses, the dev-build re-sign workaround, the
    responsible-process attribution problem and how
    `should_use_daemon_proxy` solves it via `open -n -g -a CuaDriver`.
  - What `cua-driver doctor` reports about both subsystems.
  - Symptoms-to-cause lookup table.
* Cross-links: `windows-ssh.mdx` Session 0 paragraph and
  `installation.mdx` "Grant TCC permissions" intro both link out to
  the new explainer for the deeper mechanics. `cli-reference.mdx`
  `cua-driver doctor` section also points there.
The page is grounded in `should_use_daemon_proxy` in
`libs/cua-driver-rs/crates/cua-driver/src/cli.rs` (the v0.2.7
cross-platform proxy router from PR #1580) and `doctor.rs` for the
exact warning text, so the docs stay synced with what the CLI actually
prints.
Closes CUA-537
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
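
For readers who have not looked at `cli.rs`, the routing idea behind the daemon proxy can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the code from PR #1580: the `Host` enum, its fields, and `needs_daemon_proxy` are hypothetical names, and the macOS condition is a simplification of the responsible-process problem described above.

```rust
/// Hypothetical snapshot of where the CLI process is running.
/// None of these names come from `cli.rs`; they are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Host {
    /// Windows, with the id of the session the process was started in.
    Windows { session_id: u32 },
    /// macOS, with whether this process is the one the TCC grant applies to.
    MacOs { is_granted_app_process: bool },
    /// Any other platform; assumed here to handle tool calls in-process.
    Other,
}

/// Returns true when the CLI should forward the tool call to a daemon
/// running in the right context instead of handling it itself.
pub fn needs_daemon_proxy(host: Host) -> bool {
    match host {
        // Session 0 (services, SSH logins) has no interactive desktop,
        // so session-scoped desktop APIs see nothing from there.
        Host::Windows { session_id } => session_id == 0,
        // Assumption: a process that TCC does not attribute to the
        // granted app has to go through the app-launched daemon.
        Host::MacOs { is_granted_app_process } => !is_granted_app_process,
        Host::Other => false,
    }
}
```

The point of the sketch is only that one predicate can cover both subsystems even though the underlying reason for proxying differs per OS.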
Walkthrough
This PR adds a comprehensive "Explanation" documentation section about process attribution that clarifies how Windows session scoping and macOS TCC permissions affect cua-driver tool visibility, then integrates this explanation into existing guides through targeted cross-links.
…oses CUA-541) (#1596)

* docs(cua-driver): dedicated Linux install + run guide

Adds a Linux-specific getting-started page that covers what diverges from the canonical macOS / Windows install: display server (X11 vs Wayland vs XWayland), the AT-SPI prerequisite, autostart-via-systemd (since `cua-driver autostart` is Windows-only today), and the headless Xvfb workflow for CI.

Concrete behaviors documented (verified against the source):

* Display-server probe in `libs/cua-driver-rs/crates/cua-driver/src/doctor.rs::append_platform_probes` (Linux branch). Wayland-with-XWayland is the supported case; pure Wayland w/o XWayland is not today.
* AT-SPI bus probe via `probe_at_spi_bus_via_gdbus`, required for accessibility-tree tools (`get_window_state`, indexed-element clicks). Per-distro install table (`at-spi2-core` on every distro, `gsettings` toggle for GNOME).
* `cua-driver autostart` returns the documented `NOT_YET` error on Linux from `crates/cua-driver/src/autostart.rs`. The page points users at the two working alternatives: `install-local.sh --autostart` (which registers a systemd user unit) or a hand-rolled `~/.config/systemd/user/cua-driver.service` (full file content provided). Also mentions `loginctl enable-linger` for headless runners.
* Headless workflow: `xvfb-run` recipe + standalone Xvfb + dbus-launch one-liner.
* Distro notes for Ubuntu/Debian/Fedora/Arch/Alpine.
* Cross-links to the existing PARITY.md per-tool Linux matrix, the new process-attribution explainer (#1594), the autostart concept page, and installation.mdx for the install one-liner.

Registers `linux` in `getting-started/meta.json` between `windows-ssh` and `autostart` so the sidebar ordering is: install → quickstart → windows-ssh → linux → autostart → … .

Closes CUA-538

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cua-driver-rs): add kill_app tool for force-terminate by pid

Closes CUA-541.

Adds a new `kill_app` MCP tool that force-terminates a process by pid, equivalent to `taskkill /F /PID <pid>` on Windows and `kill -9 <pid>` on macOS / Linux. Marked `destructive: true` so MCP clients with permission gating prompt before invoking.

Use case from WINDOWS_CLAUDE_CODE_TEST_PLAN.md Test G live run today: the cleanup step asked claude to close 3 leftover Calculator instances via cua-driver tools, but the only available close path is `click` on the X button (sends WM_CLOSE via PostMessage). UWP / WinUI3 apps (Calculator, Photos, Settings, Win11 Notepad) treat WM_CLOSE as "minimize to suspended state" rather than exit, leaving orphan processes that survive every polite eviction attempt. `kill_app` is the escalation: skip cooperative close, terminate the process directly. Unsaved state is lost; tool docs explicitly recommend trying the click-the-X path first.

Platform implementations:

* Windows (`crates/platform-windows/src/tools/impl_.rs`): `OpenProcess(PROCESS_TERMINATE | PROCESS_SYNCHRONIZE)` + `TerminateProcess(h, 1)` + `WaitForSingleObject(h, 2000)`. The extra `PROCESS_SYNCHRONIZE` right + 2-second wait means the caller's follow-up `list_apps` reflects the change immediately; `WAIT_TIMEOUT` is treated as a soft success since the kill itself landed. Run on a `spawn_blocking` thread because Win32 `WaitForSingleObject` is sync.
* macOS (`crates/platform-macos/src/tools/kill_app.rs`, new file): `libc::kill(pid, SIGKILL)`. Synchronous; no spawn_blocking needed. Returns `EPERM` if the daemon can't signal the target (cross-user or root-owned), `ESRCH` if the pid doesn't exist; both are surfaced via `std::io::Error::last_os_error()`.
* Linux (`crates/platform-linux/src/tools/impl_.rs`): same `libc::kill(pid, SIGKILL)` as macOS. Adds `libc = "0.2"` to platform-linux/Cargo.toml since the crate didn't depend on libc directly before.
* Stubs (`stubs.rs` in both platform crates): register `KillAppTool` for the cross-platform interface; non-target builds get the standard "not implemented for this platform" error.

Schema: single `pid: integer` argument. Bounds-checked at the tool boundary (must be positive, must fit in i32/u32 depending on the platform call signature).

Live tested locally on macOS: `cua-driver call kill_app '{"pid":<calc-pid>}'` terminates the target cleanly. Windows verification pending in the follow-up VM build + reinstall step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(cua-driver-rs/kill_app): replace ambiguous .into() with .to_string()

Caught only on Windows since the windows-rs / bytes crate ecosystem exposes more From<&str> impls than macOS / Linux. The macOS path hit the same E0283 earlier in this branch; this lines up the other two impls with the fix.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
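
The pid bounds check mentioned in the `kill_app` commit could look roughly like the following. This is a minimal sketch, not the code from that commit: `validate_pid_i32` and `validate_pid_u32` are hypothetical helper names, and taking the incoming JSON integer as `i64` is an assumption.

```rust
use std::convert::TryFrom;

/// Validates a pid for a platform call that takes a signed 32-bit pid
/// (for example a POSIX `kill`-style signature).
/// Hypothetical helper; the name and error type are illustrative.
pub fn validate_pid_i32(raw: i64) -> Result<i32, String> {
    // On POSIX, pid 0 and negative pids address process groups rather
    // than a single process, so they are rejected up front.
    if raw <= 0 {
        return Err(format!("pid must be positive, got {raw}"));
    }
    i32::try_from(raw).map_err(|_| format!("pid {raw} does not fit in i32"))
}

/// Same check for a platform call that takes an unsigned 32-bit pid.
pub fn validate_pid_u32(raw: i64) -> Result<u32, String> {
    if raw <= 0 {
        return Err(format!("pid must be positive, got {raw}"));
    }
    u32::try_from(raw).map_err(|_| format!("pid {raw} does not fit in u32"))
}
```

Rejecting bad values at the tool boundary keeps the platform-specific code free of range handling and gives the MCP client a readable error instead of an OS error code.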
Summary
Adds a Diataxis-style explanation page that disambiguates the two OS subsystems that decide what a cua-driver process can touch: Windows session attribution (Session 0 vs 1+) and macOS TCC grants keyed by (bundle id, cdhash).
They look identical from the user's POV — "my tool calls return empty arrays" — but the mechanics are completely different, and the recipes in `windows-ssh.mdx` and `installation.mdx` deliberately don't go this deep.
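
To make the macOS half concrete, here is a toy model of the (bundle id, cdhash) keying the page describes. It is a sketch for intuition only: `GrantKey`, `GrantStore`, and the example identifiers are hypothetical and do not reflect how TCC stores grants internally.

```rust
use std::collections::HashSet;

/// Toy stand-in for the key a grant is recorded under.
/// Field names are illustrative, not TCC's real schema.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct GrantKey {
    pub bundle_id: String,
    pub cdhash: String,
}

/// Toy grant database: a set of keys that have been approved.
#[derive(Debug, Default)]
pub struct GrantStore {
    granted: HashSet<GrantKey>,
}

impl GrantStore {
    pub fn new() -> Self {
        Self::default()
    }

    /// Records a grant for exactly this (bundle id, cdhash) pair.
    pub fn grant(&mut self, bundle_id: &str, cdhash: &str) {
        self.granted.insert(GrantKey {
            bundle_id: bundle_id.to_string(),
            cdhash: cdhash.to_string(),
        });
    }

    /// A lookup hits only if both parts of the key match.
    pub fn is_granted(&self, bundle_id: &str, cdhash: &str) -> bool {
        self.granted.contains(&GrantKey {
            bundle_id: bundle_id.to_string(),
            cdhash: cdhash.to_string(),
        })
    }
}
```

In this model, a rebuilt binary that keeps its bundle id but gets a new code hash looks up a key that was never granted, which is the shape of the `cargo build` miss the explainer covers.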
What's in it
Cross-links wired
Grounded in code
Test plan
Closes CUA-537
🤖 Generated with Claude Code