diff --git a/AGENTS.md b/AGENTS.md index 2f7cff2..2ccd167 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -37,12 +37,14 @@ mcp-selenium/ ├── browser.test.mjs ← start_browser, close_session, take_screenshot, multi-session ├── navigation.test.mjs ← navigate, all 6 locator strategies ├── interactions.test.mjs ← click, send_keys, get_element_text, hover, double_click, right_click, press_key, drag_and_drop, upload_file + ├── bidi.test.mjs ← BiDi enablement, console/error/network capture, session isolation └── fixtures/ ← HTML files loaded via file:// URLs ├── locators.html ├── interactions.html ├── mouse-actions.html ├── drag-drop.html - └── upload.html + ├── upload.html + └── bidi.html ``` ### Key Files in Detail @@ -82,13 +84,15 @@ All browser state is held in a module-level `state` object: ```js const state = { drivers: new Map(), // sessionId → WebDriver instance - currentSession: null // string | null — the active session ID + currentSession: null, // string | null — the active session ID + bidi: new Map() // sessionId → { available, consoleLogs, pageErrors, networkLogs } }; ``` - **Session IDs** are formatted as `{browser}_{Date.now()}` (e.g., `chrome_1708531200000`) - Only one session is "current" at a time (set by `start_browser`, cleared by `close_session`) - Multiple sessions can exist in the `drivers` Map, but tools always operate on `currentSession` +- **BiDi state** is a single Map of per-session objects — cleanup is one `state.bidi.delete(sessionId)` call ### Helper Functions @@ -96,6 +100,21 @@ const state = { |----------|---------| | `getDriver()` | Returns the WebDriver for `state.currentSession`. Throws if no active session. | | `getLocator(by, value)` | Converts a locator strategy string (`"id"`, `"css"`, `"xpath"`, `"name"`, `"tag"`, `"class"`) to a Selenium `By` object. | +| `newBidiState()` | Returns a fresh `{ available, consoleLogs, pageErrors, networkLogs }` object for a new session. 
| +| `setupBidi(driver, sessionId)` | Wires up BiDi event listeners (console, JS errors, network) for a session. Called from `start_browser`. | +| `registerBidiTool(name, description, logKey, emptyMessage, unavailableMessage)` | Factory that registers a diagnostic tool. All three BiDi tools (`get_console_logs`, `get_page_errors`, `get_network_logs`) use this — don't copy-paste a new handler, call this instead. | + +### Diagnostics (WebDriver BiDi) + +The server automatically enables [WebDriver BiDi](https://w3c.github.io/webdriver-bidi/) when starting a browser session. BiDi provides real-time, passive capture of browser diagnostics — console messages, JavaScript errors, and network activity are collected in the background without any extra configuration. + +This is especially useful for AI agents: when something goes wrong on a page, the agent can check `get_console_logs` and `get_page_errors` to understand *why*, rather than relying solely on screenshots. + +- **Automatic**: BiDi is enabled by default when the browser supports it +- **Graceful fallback**: If the browser or driver doesn't support BiDi, the session starts normally and the diagnostic tools return a helpful message +- **No performance impact**: Logs are passively captured via event listeners — no polling or extra requests +- **Per-session**: Each browser session has its own log buffers, cleaned up automatically on session close +- **BiDi modules are dynamically imported** at the top of `server.js` — if the selenium-webdriver version doesn't include them, `LogInspector` and `Network` are set to `null` and all BiDi code is skipped ### Cleanup @@ -232,6 +251,7 @@ Tests talk to the real MCP server over stdio using JSON-RPC 2.0. No mocking. 
| `browser.test.mjs` | start_browser, close_session, take_screenshot, multi-session | | `navigation.test.mjs` | navigate, all 6 locator strategies (id, css, xpath, name, tag, class) | | `interactions.test.mjs` | click, send_keys, get_element_text, hover, double_click, right_click, press_key, drag_and_drop, upload_file | +| `bidi.test.mjs` | BiDi enablement, console log capture, page error capture, network log capture, session isolation | ### When Adding a New Tool diff --git a/README.md b/README.md index 2871957..67da24d 100644 --- a/README.md +++ b/README.md @@ -23,6 +23,10 @@ A Model Context Protocol (MCP) server implementation for Selenium WebDriver, ena - Upload files - Support for headless mode - Manage browser cookies (add, get, delete) +- **Real-time diagnostics** via WebDriver BiDi: + - Console log capture (info, warn, error) + - JavaScript error detection with stack traces + - Network request monitoring (successes and failures) ## Supported Browsers @@ -791,6 +795,54 @@ Deletes cookies from the current browser session. Deletes a specific cookie by n } ``` +### get_console_logs +Retrieves captured browser console messages (log, warn, error, etc.). Console logs are automatically captured in the background via WebDriver BiDi when the browser supports it — no configuration needed. + +**Parameters:** +| Parameter | Type | Required | Description | +|-----------|------|----------|-------------| +| clear | boolean | No | Clear the captured logs after retrieving them (default: false) | + +**Example:** +```json +{ + "tool": "get_console_logs", + "parameters": {} +} +``` + +### get_page_errors +Retrieves captured JavaScript errors and uncaught exceptions with full stack traces. Errors are automatically captured in the background via WebDriver BiDi. 
+ +**Parameters:** +| Parameter | Type | Required | Description | +|-----------|------|----------|-------------| +| clear | boolean | No | Clear the captured errors after retrieving them (default: false) | + +**Example:** +```json +{ + "tool": "get_page_errors", + "parameters": {} +} +``` + +### get_network_logs +Retrieves captured network activity including successful responses and failed requests. Network logs are automatically captured in the background via WebDriver BiDi. + +**Parameters:** +| Parameter | Type | Required | Description | +|-----------|------|----------|-------------| +| clear | boolean | No | Clear the captured logs after retrieving them (default: false) | + +**Example:** +```json +{ + "tool": "get_network_logs", + "parameters": {} +} +``` + ## License MIT diff --git a/src/lib/server.js b/src/lib/server.js index 44a782d..4f3c04c 100755 --- a/src/lib/server.js +++ b/src/lib/server.js @@ -10,6 +10,18 @@ import { Options as FirefoxOptions } from 'selenium-webdriver/firefox.js'; import { Options as EdgeOptions } from 'selenium-webdriver/edge.js'; import { Options as SafariOptions } from 'selenium-webdriver/safari.js'; +// BiDi imports — loaded dynamically to avoid hard failures if not available +let LogInspector, Network; +try { + LogInspector = (await import('selenium-webdriver/bidi/logInspector.js')).default; + const networkModule = await import('selenium-webdriver/bidi/network.js'); + Network = networkModule.Network; +} catch (_) { + // BiDi modules not available in this selenium-webdriver version + LogInspector = null; + Network = null; +} + // Create an MCP server const server = new McpServer({ @@ -20,7 +32,8 @@ const server = new McpServer({ // Server state const state = { drivers: new Map(), - currentSession: null + currentSession: null, + bidi: new Map() }; // Helper functions @@ -44,6 +57,80 @@ const getLocator = (by, value) => { } }; +// BiDi helpers +const newBidiState = () => ({ + available: false, + consoleLogs: [], + pageErrors: [], + 
networkLogs: [] +}); + +async function setupBidi(driver, sessionId) { + const bidi = newBidiState(); + + const logInspector = await LogInspector(driver); + await logInspector.onConsoleEntry((entry) => { + try { + bidi.consoleLogs.push({ + level: entry.level, text: entry.text, timestamp: entry.timestamp, + type: entry.type, method: entry.method, args: entry.args + }); + } catch (_) { /* ignore malformed entry */ } + }); + await logInspector.onJavascriptLog((entry) => { + try { + bidi.pageErrors.push({ + level: entry.level, text: entry.text, timestamp: entry.timestamp, + type: entry.type, stackTrace: entry.stackTrace + }); + } catch (_) { /* ignore malformed entry */ } + }); + + const network = await Network(driver); + await network.responseCompleted((event) => { + try { + bidi.networkLogs.push({ + type: 'response', url: event.request?.url, status: event.response?.status, + method: event.request?.method, mimeType: event.response?.mimeType, timestamp: Date.now() + }); + } catch (_) { /* ignore malformed event */ } + }); + await network.fetchError((event) => { + try { + bidi.networkLogs.push({ + type: 'error', url: event.request?.url, method: event.request?.method, + errorText: event.errorText, timestamp: Date.now() + }); + } catch (_) { /* ignore malformed event */ } + }); + + bidi.available = true; + state.bidi.set(sessionId, bidi); +} + +function registerBidiTool(name, description, logKey, emptyMessage, unavailableMessage) { + server.tool( + name, + description, + { clear: z.boolean().optional().describe("Clear after returning (default: false)") }, + async ({ clear = false }) => { + try { + getDriver(); + const bidi = state.bidi.get(state.currentSession); + if (!bidi?.available) { + return { content: [{ type: 'text', text: unavailableMessage }] }; + } + const logs = bidi[logKey]; + const result = logs.length === 0 ? 
emptyMessage : JSON.stringify(logs, null, 2); + if (clear) bidi[logKey] = []; + return { content: [{ type: 'text', text: result }] }; + } catch (e) { + return { content: [{ type: 'text', text: `Error: ${e.message}` }], isError: true }; + } + } + ); +} + // Common schemas const browserOptionsSchema = z.object({ headless: z.boolean().optional().describe("Run browser in headless mode"), @@ -69,6 +156,14 @@ server.tool( let builder = new Builder(); let driver; let warnings = []; + + // Enable BiDi websocket if the modules are available + if (LogInspector && Network) { + // 'ignore' prevents BiDi from auto-dismissing alert/confirm/prompt dialogs, + // allowing accept_alert, dismiss_alert, and get_alert_text to work as expected. + builder = builder.withCapabilities({ 'webSocketUrl': true, 'unhandledPromptBehavior': 'ignore' }); + } + switch (browser) { case 'chrome': { const chromeOptions = new ChromeOptions(); @@ -134,7 +229,19 @@ server.tool( state.drivers.set(sessionId, driver); state.currentSession = sessionId; + // Attempt to enable BiDi for real-time log capture + if (LogInspector && Network) { + try { + await setupBidi(driver, sessionId); + } catch (_) { + // BiDi not supported by this browser/driver — continue without it + } + } + let message = `Browser started with session_id: ${sessionId}`; + if (state.bidi.get(sessionId)?.available) { + message += ' (BiDi enabled: console logs, JS errors, and network activity are being captured)'; + } if (warnings.length > 0) { message += `\nWarnings: ${warnings.join(' ')}`; } @@ -473,10 +580,14 @@ server.tool( async () => { try { const driver = getDriver(); - await driver.quit(); - state.drivers.delete(state.currentSession); const sessionId = state.currentSession; - state.currentSession = null; + try { + await driver.quit(); + } finally { + state.drivers.delete(sessionId); + state.bidi.delete(sessionId); + state.currentSession = null; + } return { content: [{ type: 'text', text: `Browser session ${sessionId} closed` }] }; @@ 
-681,6 +792,7 @@ server.tool( console.error(`Error quitting driver for session ${sessionId}:`, quitError); } state.drivers.delete(sessionId); + state.bidi.delete(sessionId); state.currentSession = null; return { content: [{ type: 'text', text: 'Last window closed. Session ended.' }] @@ -957,6 +1069,31 @@ server.tool( } ); +// BiDi Diagnostic Tools +registerBidiTool( + 'get_console_logs', + 'returns browser console messages (log, info, warn, error, debug) captured via WebDriver BiDi. Useful for debugging page behavior, seeing application output, and catching warnings.', + 'consoleLogs', + 'No console logs captured', + 'Console log capture is not available (BiDi not supported by this browser/driver)' +); + +registerBidiTool( + 'get_page_errors', + 'returns JavaScript errors and exceptions captured via WebDriver BiDi. Includes stack traces when available. Essential for diagnosing why a page is broken or a feature isn\'t working.', + 'pageErrors', + 'No page errors captured', + 'Page error capture is not available (BiDi not supported by this browser/driver)' +); + +registerBidiTool( + 'get_network_logs', + 'returns network activity (completed responses and failed requests) captured via WebDriver BiDi. Shows HTTP status codes, URLs, methods, and error details. 
Useful for diagnosing failed API calls and broken resources.', + 'networkLogs', + 'No network activity captured', + 'Network log capture is not available (BiDi not supported by this browser/driver)' +); + // Resources server.resource( "browser-status", @@ -986,6 +1123,7 @@ async function cleanup() { } } state.drivers.clear(); + state.bidi.clear(); state.currentSession = null; process.exit(0); } diff --git a/test/bidi.test.mjs b/test/bidi.test.mjs new file mode 100644 index 0000000..778f1dd --- /dev/null +++ b/test/bidi.test.mjs @@ -0,0 +1,170 @@ +import { describe, it, after, before } from 'node:test'; +import assert from 'node:assert/strict'; +import { McpClient, getResponseText, fixture } from './mcp-client.mjs'; + +describe('BiDi Diagnostic Tools', () => { + let client; + + before(async () => { + client = new McpClient(); + await client.start(); + }); + + after(async () => { + await client.stop(); + }); + + describe('BiDi Enablement', () => { + after(async () => { + try { await client.callTool('close_session', {}); } catch (_) {} + }); + + it('should enable BiDi automatically when starting browser', async () => { + const result = await client.callTool('start_browser', { + browser: 'chrome', + options: { headless: true, arguments: ['--no-sandbox'] } + }); + const text = getResponseText(result); + assert.ok(text.includes('BiDi enabled'), `Expected BiDi enabled message, got: ${text}`); + }); + }); + + describe('Console Log Capture', () => { + before(async () => { + await client.callTool('start_browser', { + browser: 'chrome', + options: { headless: true, arguments: ['--no-sandbox'] } + }); + await client.callTool('navigate', { url: fixture('bidi.html') }); + }); + + after(async () => { + try { await client.callTool('close_session', {}); } catch (_) {} + }); + + it('should capture console messages at different levels', async () => { + await client.callTool('get_console_logs', { clear: true }); + + await client.callTool('click_element', { by: 'id', value: 'log-info' 
}); + await client.callTool('click_element', { by: 'id', value: 'log-warn' }); + await client.callTool('click_element', { by: 'id', value: 'log-error' }); + await new Promise(r => setTimeout(r, 500)); + + const result = await client.callTool('get_console_logs', {}); + assert.ok(!result.isError, `Tool returned error: ${getResponseText(result)}`); + const logs = JSON.parse(getResponseText(result)); + + assert.ok(logs.find(l => l.text?.includes('Hello from console')), 'Should capture console.log'); + const warnLog = logs.find(l => l.text?.includes('This is a warning')); + assert.ok(warnLog, 'Should capture console.warn'); + assert.ok(warnLog.level === 'warn' || warnLog.level === 'warning', `Expected warn level, got: ${warnLog.level}`); + const errorLog = logs.find(l => l.text?.includes('This is a console error')); + assert.ok(errorLog, 'Should capture console.error'); + assert.strictEqual(errorLog.level, 'error'); + }); + + it('should clear logs when clear=true and return empty on next read', async () => { + await client.callTool('execute_script', { script: 'console.log("clear-test");' }); + await new Promise(r => setTimeout(r, 500)); + + const clearResult = await client.callTool('get_console_logs', { clear: true }); + assert.ok(getResponseText(clearResult).includes('clear-test'), 'Should return logs before clearing'); + + const afterResult = await client.callTool('get_console_logs', {}); + assert.strictEqual(getResponseText(afterResult), 'No console logs captured'); + }); + }); + + describe('Page Error Capture', () => { + before(async () => { + await client.callTool('start_browser', { + browser: 'chrome', + options: { headless: true, arguments: ['--no-sandbox'] } + }); + await client.callTool('navigate', { url: fixture('bidi.html') }); + }); + + after(async () => { + try { await client.callTool('close_session', {}); } catch (_) {} + }); + + it('should capture JavaScript errors with stack traces', async () => { + await client.callTool('get_page_errors', { clear: true 
}); + await client.callTool('execute_script', { + script: 'setTimeout(() => { throw new Error("Intentional test error"); }, 0);' + }); + await new Promise(r => setTimeout(r, 1000)); + const result = await client.callTool('get_page_errors', {}); + assert.ok(!result.isError, `Tool returned error: ${getResponseText(result)}`); + const text = getResponseText(result); + const errors = JSON.parse(text); + const jsError = errors.find(e => e.text?.includes('Intentional test error')); + assert.ok(jsError, `Expected JS error with 'Intentional test error', got: ${text}`); + assert.strictEqual(jsError.type, 'javascript'); + assert.ok(jsError.stackTrace, 'Should include stack trace'); + }); + }); + + describe('Network Log Capture', () => { + before(async () => { + await client.callTool('start_browser', { + browser: 'chrome', + options: { headless: true, arguments: ['--no-sandbox'] } + }); + }); + + after(async () => { + try { await client.callTool('close_session', {}); } catch (_) {} + }); + + it('should capture successful and failed network requests', async () => { + await client.callTool('get_network_logs', { clear: true }); + await client.callTool('navigate', { url: fixture('bidi.html') }); + await client.callTool('execute_script', { + script: 'fetch("http://localhost:1/nonexistent").catch(() => {});' + }); + await new Promise(r => setTimeout(r, 1000)); + + const result = await client.callTool('get_network_logs', {}); + assert.ok(!result.isError, `Tool returned error: ${getResponseText(result)}`); + const logs = JSON.parse(getResponseText(result)); + + const pageLoad = logs.find(l => l.url?.includes('bidi.html')); + assert.ok(pageLoad, 'Should capture page navigation'); + assert.strictEqual(pageLoad.method, 'GET'); + + const failedRequest = logs.find(l => l.type === 'error'); + assert.ok(failedRequest, 'Should capture failed network request'); + }); + }); + + describe('Session Isolation', () => { + after(async () => { + try { await client.callTool('close_session', {}); } 
catch (_) {} + }); + + it('should reset BiDi logs when starting a new session', async () => { + await client.callTool('start_browser', { + browser: 'chrome', + options: { headless: true, arguments: ['--no-sandbox'] } + }); + await client.callTool('navigate', { url: fixture('bidi.html') }); + await client.callTool('execute_script', { script: 'console.log("session-1-log");' }); + await new Promise(r => setTimeout(r, 500)); + + const firstLogs = await client.callTool('get_console_logs', {}); + assert.ok(getResponseText(firstLogs).includes('session-1-log')); + + await client.callTool('close_session', {}); + await client.callTool('start_browser', { + browser: 'chrome', + options: { headless: true, arguments: ['--no-sandbox'] } + }); + + const newLogs = await client.callTool('get_console_logs', {}); + assert.strictEqual(getResponseText(newLogs), 'No console logs captured'); + + await client.callTool('close_session', {}); + }); + }); +}); diff --git a/test/fixtures/bidi.html b/test/fixtures/bidi.html new file mode 100644 index 0000000..6164f1d --- /dev/null +++ b/test/fixtures/bidi.html @@ -0,0 +1,14 @@ + + +
+