Merged
24 changes: 22 additions & 2 deletions AGENTS.md
@@ -37,12 +37,14 @@ mcp-selenium/
├── browser.test.mjs ← start_browser, close_session, take_screenshot, multi-session
├── navigation.test.mjs ← navigate, all 6 locator strategies
├── interactions.test.mjs ← click, send_keys, get_element_text, hover, double_click, right_click, press_key, drag_and_drop, upload_file
├── bidi.test.mjs ← BiDi enablement, console/error/network capture, session isolation
└── fixtures/ ← HTML files loaded via file:// URLs
├── locators.html
├── interactions.html
├── mouse-actions.html
├── drag-drop.html
- └── upload.html
+ ├── upload.html
+ └── bidi.html
```

### Key Files in Detail
@@ -82,20 +84,37 @@ All browser state is held in a module-level `state` object:
```js
const state = {
drivers: new Map(), // sessionId → WebDriver instance
- currentSession: null // string | null: the active session ID
+ currentSession: null, // string | null: the active session ID
+ bidi: new Map() // sessionId → { available, consoleLogs, pageErrors, networkLogs }
};
```

- **Session IDs** are formatted as `{browser}_{Date.now()}` (e.g., `chrome_1708531200000`)
- Only one session is "current" at a time (set by `start_browser`, cleared by `close_session`)
- Multiple sessions can exist in the `drivers` Map, but tools always operate on `currentSession`
- **BiDi state** is a single Map of per-session objects — cleanup is one `state.bidi.delete(sessionId)` call
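
The bookkeeping described above can be sketched without a real browser. This is a minimal model, not the server's code: `openSession` and `closeCurrentSession` are hypothetical helper names, the driver is a plain stub object, and the sketch always creates a BiDi entry, whereas the server only stores one when BiDi setup succeeds.

```js
// Minimal model of the session bookkeeping (no real browser involved).
const state = {
  drivers: new Map(),   // sessionId → driver (a plain stub object here)
  currentSession: null, // the active session ID, or null
  bidi: new Map()       // sessionId → per-session diagnostic buffers
};

// Hypothetical helper: registers a session and makes it current.
function openSession(browser, driver, now = Date.now()) {
  const sessionId = `${browser}_${now}`;
  state.drivers.set(sessionId, driver);
  // Assumption: BiDi setup succeeded, so the session gets its own buffers.
  state.bidi.set(sessionId, {
    available: true, consoleLogs: [], pageErrors: [], networkLogs: []
  });
  state.currentSession = sessionId;
  return sessionId;
}

// Hypothetical helper: drops everything held for the current session.
function closeCurrentSession() {
  const sessionId = state.currentSession;
  if (sessionId === null) throw new Error('No active browser session');
  state.drivers.delete(sessionId);
  state.bidi.delete(sessionId); // one call removes all three log buffers
  state.currentSession = null;
  return sessionId;
}
```

Because each session owns a separate object in `state.bidi`, entries pushed for one session never appear in another session's buffers.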

### Helper Functions

| Function | Purpose |
|----------|---------|
| `getDriver()` | Returns the WebDriver for `state.currentSession`. Throws if no active session. |
| `getLocator(by, value)` | Converts a locator strategy string (`"id"`, `"css"`, `"xpath"`, `"name"`, `"tag"`, `"class"`) to a Selenium `By` object. |
| `newBidiState()` | Returns a fresh `{ available, consoleLogs, pageErrors, networkLogs }` object for a new session. |
| `setupBidi(driver, sessionId)` | Wires up BiDi event listeners (console, JS errors, network) for a session. Called from `start_browser`. |
| `registerBidiTool(name, description, logKey, emptyMessage, unavailableMessage)` | Factory that registers a diagnostic tool. All three BiDi tools (`get_console_logs`, `get_page_errors`, `get_network_logs`) use this — don't copy-paste a new handler, call this instead. |
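
To illustrate why a factory keeps the three diagnostic tools in step, here is a simplified sketch of the read-then-optionally-clear logic. It is an approximation under stated assumptions: `tools` is a plain `Map` standing in for the MCP server, `ctx` stands in for the module-level state, `registerLogTool` is a hypothetical name, and the handlers return plain strings rather than MCP content objects.

```js
// Stand-in for the MCP server: handlers stored by tool name (hypothetical).
const tools = new Map();

// Stand-in for module-level state: current session plus per-session buffers.
const ctx = { currentSession: null, bidi: new Map() };

function registerLogTool(name, logKey, emptyMessage, unavailableMessage) {
  tools.set(name, ({ clear = false } = {}) => {
    const bidi = ctx.bidi.get(ctx.currentSession);
    if (!bidi?.available) return unavailableMessage;
    const logs = bidi[logKey];
    const text = logs.length === 0 ? emptyMessage : JSON.stringify(logs, null, 2);
    // Clear only after serializing, so the caller still receives the entries.
    if (clear) bidi[logKey] = [];
    return text;
  });
}

// Each tool differs only in its buffer key and messages.
registerLogTool('get_console_logs', 'consoleLogs',
  'No console logs captured', 'Console log capture is not available');
registerLogTool('get_page_errors', 'pageErrors',
  'No page errors captured', 'Page error capture is not available');
```

Clearing replaces the array on the shared per-session object, so listeners that push to `bidi[logKey]` at event time write into the fresh array.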

### Diagnostics (WebDriver BiDi)

The server automatically enables [WebDriver BiDi](https://w3c.github.io/webdriver-bidi/) when starting a browser session. BiDi provides real-time, passive capture of browser diagnostics — console messages, JavaScript errors, and network activity are collected in the background without any extra configuration.

This is especially useful for AI agents: when something goes wrong on a page, the agent can check `get_console_logs` and `get_page_errors` to understand *why*, rather than relying solely on screenshots.

- **Automatic**: BiDi is enabled by default when the browser supports it
- **Graceful fallback**: If the browser or driver doesn't support BiDi, the session starts normally and the diagnostic tools return a helpful message
- **Low overhead**: Logs are captured passively via event listeners, with no polling or extra requests
- **Per-session**: Each browser session has its own log buffers, cleaned up automatically on session close
- **BiDi modules are dynamically imported** at the top of `server.js` — if the selenium-webdriver version doesn't include them, `LogInspector` and `Network` are set to `null` and all BiDi code is skipped
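
The graceful-fallback behaviour can be approximated with an optional dynamic import. This is a sketch rather than the code in `server.js`: `loadOptional` and `bidiEnabled` are hypothetical names introduced here for illustration.

```js
// Hypothetical helper: resolves to the module namespace, or null when the
// module cannot be loaded (for example, an older package without BiDi files).
async function loadOptional(specifier) {
  try {
    return await import(specifier);
  } catch (_) {
    return null; // callers treat null as "feature off"
  }
}

// Feature gate: BiDi code paths run only when every required module loaded.
function bidiEnabled(modules) {
  return modules.every((m) => m !== null);
}
```

A caller would load both BiDi modules with `loadOptional`, then check `bidiEnabled([...])` once before requesting the websocket capability or attaching listeners.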

### Cleanup

@@ -232,6 +251,7 @@ Tests talk to the real MCP server over stdio using JSON-RPC 2.0. No mocking.
| `browser.test.mjs` | start_browser, close_session, take_screenshot, multi-session |
| `navigation.test.mjs` | navigate, all 6 locator strategies (id, css, xpath, name, tag, class) |
| `interactions.test.mjs` | click, send_keys, get_element_text, hover, double_click, right_click, press_key, drag_and_drop, upload_file |
| `bidi.test.mjs` | BiDi enablement, console log capture, page error capture, network log capture, session isolation |

### When Adding a New Tool

52 changes: 52 additions & 0 deletions README.md
@@ -23,6 +23,10 @@ A Model Context Protocol (MCP) server implementation for Selenium WebDriver, ena
- Upload files
- Support for headless mode
- Manage browser cookies (add, get, delete)
- **Real-time diagnostics** via WebDriver BiDi:
  - Console log capture (info, warn, error)
  - JavaScript error detection with stack traces
  - Network request monitoring (successes and failures)

## Supported Browsers

@@ -791,6 +795,54 @@ Deletes cookies from the current browser session. Deletes a specific cookie by n
}
```

### get_console_logs
Retrieves captured browser console messages (log, warn, error, etc.). Console logs are automatically captured in the background via WebDriver BiDi when the browser supports it — no configuration needed.

**Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| clear | boolean | No | Clear the captured logs after retrieving them (default: false) |

**Example:**
```json
{
"tool": "get_console_logs",
"parameters": {}
}
```

### get_page_errors
Retrieves captured JavaScript errors and uncaught exceptions with full stack traces. Errors are automatically captured in the background via WebDriver BiDi.

**Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| clear | boolean | No | Clear the captured errors after retrieving them (default: false) |

**Example:**
```json
{
"tool": "get_page_errors",
"parameters": {}
}
```

### get_network_logs
Retrieves captured network activity including successful responses and failed requests. Network logs are automatically captured in the background via WebDriver BiDi.

**Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| clear | boolean | No | Clear the captured logs after retrieving them (default: false) |

**Example:**
```json
{
"tool": "get_network_logs",
"parameters": {}
}
```
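
For clients that speak MCP directly, tool invocations travel as JSON-RPC 2.0 `tools/call` requests. The sketch below shows one way to build such a request and to read the text reply from the three diagnostic tools. `buildToolCall` and `parseLogResult` are hypothetical helper names, and the reply handling assumes the tool returns either a JSON array or a plain-text status message.

```js
// Build a JSON-RPC 2.0 request that invokes an MCP tool by name.
function buildToolCall(id, name, args = {}) {
  return { jsonrpc: '2.0', id, method: 'tools/call', params: { name, arguments: args } };
}

// Hypothetical helper: turn a tool result into an array of log entries.
function parseLogResult(result) {
  const text = result.content?.[0]?.text ?? '';
  if (result.isError) throw new Error(text);
  try {
    const parsed = JSON.parse(text);
    return Array.isArray(parsed) ? parsed : [];
  } catch (_) {
    // Plain-text replies ("No console logs captured", or the BiDi
    // unavailable message) carry no entries.
    return [];
  }
}

// Fetch console logs and clear the buffer in the same call.
const request = buildToolCall(1, 'get_console_logs', { clear: true });
```

Passing `clear: true` on each poll means every call returns only the entries captured since the previous one.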

## License

MIT
139 changes: 136 additions & 3 deletions src/lib/server.js
@@ -10,6 +10,18 @@ import { Options as FirefoxOptions } from 'selenium-webdriver/firefox.js';
import { Options as EdgeOptions } from 'selenium-webdriver/edge.js';
import { Options as SafariOptions } from 'selenium-webdriver/safari.js';

// BiDi imports — loaded dynamically to avoid hard failures if not available
let LogInspector, Network;
try {
  LogInspector = (await import('selenium-webdriver/bidi/logInspector.js')).default;
  const networkModule = await import('selenium-webdriver/bidi/network.js');
  Network = networkModule.Network;
} catch (_) {
  // BiDi modules not available in this selenium-webdriver version
  LogInspector = null;
  Network = null;
}


// Create an MCP server
const server = new McpServer({
@@ -20,7 +32,8 @@ const server = new McpServer({
// Server state
const state = {
drivers: new Map(),
- currentSession: null
+ currentSession: null,
+ bidi: new Map()
};

// Helper functions
@@ -44,6 +57,80 @@ const getLocator = (by, value) => {
}
};

// BiDi helpers
const newBidiState = () => ({
  available: false,
  consoleLogs: [],
  pageErrors: [],
  networkLogs: []
});

async function setupBidi(driver, sessionId) {
  const bidi = newBidiState();

  // Console messages (console.log, console.warn, ...)
  const logInspector = await LogInspector(driver);
  await logInspector.onConsoleEntry((entry) => {
    try {
      bidi.consoleLogs.push({
        level: entry.level, text: entry.text, timestamp: entry.timestamp,
        type: entry.type, method: entry.method, args: entry.args
      });
    } catch (_) { /* ignore malformed entry */ }
  });
  // JavaScript log entries, including stack traces when present
  await logInspector.onJavascriptLog((entry) => {
    try {
      bidi.pageErrors.push({
        level: entry.level, text: entry.text, timestamp: entry.timestamp,
        type: entry.type, stackTrace: entry.stackTrace
      });
    } catch (_) { /* ignore malformed entry */ }
  });

  // Network: completed responses and failed requests share one buffer
  const network = await Network(driver);
  await network.responseCompleted((event) => {
    try {
      bidi.networkLogs.push({
        type: 'response', url: event.request?.url, status: event.response?.status,
        method: event.request?.method, mimeType: event.response?.mimeType, timestamp: Date.now()
      });
    } catch (_) { /* ignore malformed event */ }
  });
  await network.fetchError((event) => {
    try {
      bidi.networkLogs.push({
        type: 'error', url: event.request?.url, method: event.request?.method,
        errorText: event.errorText, timestamp: Date.now()
      });
    } catch (_) { /* ignore malformed event */ }
  });

  // Publish the state only after every listener is attached: if any step
  // above throws, the session is left without a BiDi entry.
  bidi.available = true;
  state.bidi.set(sessionId, bidi);
}

function registerBidiTool(name, description, logKey, emptyMessage, unavailableMessage) {
  server.tool(
    name,
    description,
    { clear: z.boolean().optional().describe("Clear after returning (default: false)") },
    async ({ clear = false }) => {
      try {
        getDriver(); // throws if there is no active session
        const bidi = state.bidi.get(state.currentSession);
        if (!bidi?.available) {
          return { content: [{ type: 'text', text: unavailableMessage }] };
        }
        const logs = bidi[logKey];
        const result = logs.length === 0 ? emptyMessage : JSON.stringify(logs, null, 2);
        // Serialize before clearing so the caller still receives the entries
        if (clear) bidi[logKey] = [];
        return { content: [{ type: 'text', text: result }] };
      } catch (e) {
        return { content: [{ type: 'text', text: `Error: ${e.message}` }], isError: true };
      }
    }
  );
}

// Common schemas
const browserOptionsSchema = z.object({
headless: z.boolean().optional().describe("Run browser in headless mode"),
@@ -69,6 +156,12 @@ server.tool(
let builder = new Builder();
let driver;
let warnings = [];

// Enable BiDi websocket if the modules are available
if (LogInspector && Network) {
builder = builder.withCapabilities({ 'webSocketUrl': true, 'unhandledPromptBehavior': 'ignore' });
}

switch (browser) {
case 'chrome': {
const chromeOptions = new ChromeOptions();
@@ -134,7 +227,19 @@ server.tool(
state.drivers.set(sessionId, driver);
state.currentSession = sessionId;

// Attempt to enable BiDi for real-time log capture
if (LogInspector && Network) {
try {
await setupBidi(driver, sessionId);
} catch (_) {
// BiDi not supported by this browser/driver — continue without it
}
}

let message = `Browser started with session_id: ${sessionId}`;
if (state.bidi.get(sessionId)?.available) {
message += ' (BiDi enabled: console logs, JS errors, and network activity are being captured)';
}
if (warnings.length > 0) {
message += `\nWarnings: ${warnings.join(' ')}`;
}
@@ -473,9 +578,10 @@ server.tool(
async () => {
try {
const driver = getDriver();
- await driver.quit();
- state.drivers.delete(state.currentSession);
const sessionId = state.currentSession;
+ await driver.quit();
+ state.drivers.delete(sessionId);
+ state.bidi.delete(sessionId);
state.currentSession = null;
return {
content: [{ type: 'text', text: `Browser session ${sessionId} closed` }]
@@ -681,6 +787,7 @@ server.tool(
console.error(`Error quitting driver for session ${sessionId}:`, quitError);
}
state.drivers.delete(sessionId);
state.bidi.delete(sessionId);
state.currentSession = null;
return {
content: [{ type: 'text', text: 'Last window closed. Session ended.' }]
@@ -957,6 +1064,31 @@ server.tool(
}
);

// BiDi Diagnostic Tools
registerBidiTool(
  'get_console_logs',
  'returns browser console messages (log, warn, info, debug) captured via WebDriver BiDi. Useful for debugging page behavior, seeing application output, and catching warnings.',
  'consoleLogs',
  'No console logs captured',
  'Console log capture is not available (BiDi not supported by this browser/driver)'
);

registerBidiTool(
  'get_page_errors',
  'returns JavaScript errors and exceptions captured via WebDriver BiDi. Includes stack traces when available. Essential for diagnosing why a page is broken or a feature isn\'t working.',
  'pageErrors',
  'No page errors captured',
  'Page error capture is not available (BiDi not supported by this browser/driver)'
);

registerBidiTool(
  'get_network_logs',
  'returns network activity (completed responses and failed requests) captured via WebDriver BiDi. Shows HTTP status codes, URLs, methods, and error details. Useful for diagnosing failed API calls and broken resources.',
  'networkLogs',
  'No network activity captured',
  'Network log capture is not available (BiDi not supported by this browser/driver)'
);

// Resources
server.resource(
"browser-status",
@@ -986,6 +1118,7 @@ async function cleanup() {
}
}
state.drivers.clear();
state.bidi.clear();
state.currentSession = null;
process.exit(0);
}