[STG-2419] feat: add playwright-to-stagehand skill #140
Draft
shrey150 wants to merge 2 commits into
…n) to Stagehand v3 on Browserbase

Converts Playwright automation scripts to Stagehand v3 (TypeScript) on Browserbase. Stagehand v3's understudy page API is Playwright-flavored but only partially compatible, so the skill frames every step as one of three moves: Port the compatible subset, Rewrite the different-shape constructs (page.click(sel) -> locator(sel).click(), $$eval -> evaluate, getByTestId -> [data-testid], positional setViewportSize), and Upgrade-or-flag the rest (brittle selectors/list scrapes -> act/extract; getByRole/Text/Label -> act; route/waitForResponse/expect/downloads -> needs-human-review). Handles TS/JS and Python sources; flags @playwright/test files as out of scope (Stagehand is not a test runner).

- SKILL.md: scope gate, source detection, inventory, port/rewrite/upgrade triage, v3 rewrite, migration summary
- references/: api-mapping (full page-API compatibility table verified vs Stagehand 3.6.0), determinism (keep/rewrite/upgrade decision tree), guide, prompt (tool-agnostic), trace-assisted
- EXAMPLES.md: before/after pairs (TS + Python, plus the test-file and network-interception gaps)
- README row added; passes validate-skills.mjs

Validated with a live eval (9 real Playwright scripts converted via skill-only subagents -> tsc -> run on live Browserbase): 9/9 compile, 9/9 run, outcomes match ground truth.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…stic, not AI

The decision tree routed every read to extract() and showed the password fill via act()+variables. Both over-AI-ify deterministic code. Corrected:

- Reads: stable-selector scrapes default to a deterministic page.evaluate(...) (zero AI/zero cost); extract() is reserved for brittle/variable markup or when DOM-drift resilience is wanted. ($$eval has no understudy equivalent, but evaluate does.)
- Secrets: stable fields fill deterministically via locator(sel).fill(process.env...), with no LLM call, and the secret never enters a prompt; act()+variables only when the field needs AI resolution.
- Updated SKILL.md (triage, checklist, mistakes), determinism.md (read decision + failure modes), api-mapping.md (§4.1, §4.4), EXAMPLES.md (#1, #2), prompt.md.

Re-verified with skill-only reconversions on the corrected skill: case 01 (scrape) now emits page.evaluate (0 extract), case 03 (login) fills the password deterministically (0 act). Both are tsc-clean and run live with correct output.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Adds `playwright-to-stagehand`, a skill that migrates Playwright automation scripts (TypeScript/JavaScript or Python) to Stagehand v3 (TypeScript) on Browserbase. It's the Playwright counterpart to the merged `browser-use-to-stagehand` skill and follows the same structure (lean `SKILL.md` → `references/`) and the same live-eval verification discipline.

Linear: STG-2419
The core design decision
The migrations pull in opposite directions, and that's the whole point:
Stagehand v3 does not run Playwright: it uses the `understudy` CDP engine, whose page API is Playwright-flavored but only partially compatible (verified against `@browserbasehq/stagehand` 3.6.0 source). So a naive transpile is wrong. Every step is one of three moves:

- Port the compatible subset (`page.goto`, `page.locator(css/xpath).fill/click`, `evaluate`, `screenshot`, `frames`, `waitForSelector/LoadState`).
- Rewrite the different-shape constructs: `page.click(sel)` → `page.locator(sel).click()` (page-level click is coordinate-based), stable-selector `$$eval` → `page.evaluate(...)` (deterministic, zero AI), `getByTestId` → `[data-testid]`, positional `setViewportSize`, `waitForURL` → poll.
- Upgrade-or-flag the rest: brittle selectors/list scrapes → `act`/`extract`/`observe`; semantic `getByRole/Text/Label` → `act` or CSS; `route`/`waitForResponse`/`page.on(event)`/`expect()`/downloads/multi-context → needs-human-review.

Deterministic-by-default is load-bearing. The skill is explicit that you should not over-AI-ify: a stable-selector scrape stays a `page.evaluate`, a stable `#id` click stays a `locator`, and a secret goes into a deterministic `locator("#password").fill(process.env.PASS!)` (no LLM call, and the secret never enters a prompt). `act`/`extract` are reserved for genuinely brittle/semantic/variable steps or for cases where you explicitly want DOM-drift resilience. It also gates scope: `@playwright/test` files are out of scope (Stagehand isn't a test runner); lift only the browser logic and map `expect()` to read-and-throw.

What's in it
- `SKILL.md`: scope gate, source detection (TS/JS vs Python; plain vs `@playwright/test`), inventory, the port/rewrite/upgrade triage, the v3 rewrite, a migration summary.
- `references/api-mapping.md`: the full page-API compatibility table (Port / Rewrite / Upgrade-or-flag for every common Playwright call), verified against Stagehand 3.6.0 source, plus the Python→TS cross-language mapping and the gap list.
- `references/determinism.md`: the keep/rewrite/upgrade decision tree (reads default to deterministic `page.evaluate`) and the failure modes (over-AI-ify, under-migrate, copy-what-doesn't-exist).
- `references/guide.md`, `references/prompt.md` (tool-agnostic), `references/trace-assisted.md`, `EXAMPLES.md` (TS + Python before/after, incl. test-file and network-interception gaps), `LICENSE.txt`.
- `README.md` row added; passes `node scripts/validate-skills.mjs` (18/18, 0 errors/0 warnings).

E2E Test Matrix
Verified with a live eval: real Playwright scripts (TS + Python, plain + `@playwright/test`, brittle scrapes, login, semantic locators, network interception, screenshot) → converted by skill-only subagents (each loads only the skill, not Stagehand prior knowledge) → `tsc --noEmit` → run live on Browserbase → graded vs ground truth.

| Check | Result | Notes |
| --- | --- | --- |
| `node scripts/validate-skills.mjs` | 18 passed, 0 failed, 0 error(s), 0 warning(s) | |
| `tsc --noEmit` on all converted scripts | 9/9 tsc OK | |
| `$$eval` scrape (live) | scraped 11 books, £ prices exact | |
| `expect` (live) | You logged into a secure area! | `#id` ported via locator; `expect` → throw. |
| `getBy*` → CSS locator (live) | | `getBy*` to stable CSS + polyfilled missing `waitForURL`; no over-AI-ifying. |
| `@playwright/test` spec (live) | | `expect` → read+throw. |
| `route` + `waitForResponse` gap (live) | | `route`, restructured XHR-sniff. |
| `screenshot` (live) | `669151` bytes, `20` books | `screenshot` → Buffer, positional viewport. |

Result: 9/9 compile, 9/9 run live, 9/9 outcomes match ground truth, no skill-attributable failures.
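Two of the rewrites noted in the matrix, polyfilling the missing `waitForURL` by polling and mapping `expect` to read-and-throw, can be sketched as below. This is a minimal illustration and not the code the skill emits. The names `UrlSource`, `waitForUrlByPolling`, and `assertIncludes` are hypothetical, and the sketch assumes only that the page object exposes a synchronous `url()` getter, which should be checked against the Stagehand version in use.

```typescript
// Hypothetical minimal shape: the only page method this sketch relies on.
// Assumption: the real page object exposes a synchronous url() getter.
interface UrlSource {
  url(): string;
}

// Polls page.url() until the predicate matches, standing in for
// Playwright's page.waitForURL. Rejects once the timeout elapses.
async function waitForUrlByPolling(
  page: UrlSource,
  matches: (url: string) => boolean,
  timeoutMs: number = 10000,
  intervalMs: number = 100,
): Promise<string> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const current = page.url();
    if (matches(current)) {
      return current;
    }
    if (Date.now() >= deadline) {
      throw new Error(
        `Timed out after ${timeoutMs} ms waiting for URL; last seen: ${current}`,
      );
    }
    // Sleep before the next poll.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Read-and-throw stand-in for an expect(...) text assertion:
// the caller reads the text itself, then this helper throws on a mismatch.
function assertIncludes(actual: string, expected: string, label: string): void {
  if (!actual.includes(expected)) {
    throw new Error(
      `${label}: expected text to include "${expected}", got "${actual}"`,
    );
  }
}
```

In a converted login script, the text read from the page would be passed to `assertIncludes`, so a mismatch fails the run with a plain `Error` rather than relying on a test runner.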
Correction during review (deterministic-by-default)
Review caught an over-AI-ify lean in the first eval pass: the converters sent stable-selector scrapes to `extract()` and a stable `#password` to `act()`, when a deterministic `page.evaluate`/`locator.fill` is better (free, instant, secret never reaches a prompt). Root cause was the decision tree routing every read to `extract`. Fixed the skill to default reads/secret-fills to deterministic, then re-verified with fresh skill-only reconversions on the corrected skill:

- Case 01 (scrape): `page.evaluate`, 0 `extract`, 0 `act`; tsc-clean, correct quotes live.
- Case 03 (login): `locator("#password").fill(process.env…)`, 0 `act`; tsc-clean, "secure area" live.

The eval harness (skill-only conversion → tsc → live Browserbase → grade) doubles as a drift detector to re-run on each Stagehand/Playwright release.

Notes
- `.changeset/config.json` (skills marketplace, not a published package).
- `.claude-plugin/marketplace.json`: consistent with the sibling `browser-use-to-stagehand`, which isn't listed there either.

🤖 Generated with Claude Code