How to Set Up and Use the Playwright MCP Server — Complete Guide
Step-by-step guide to browser automation with Playwright MCP. Navigate pages, fill forms, take screenshots, and run accessibility checks from your AI editor.
The Playwright MCP server gives your AI coding assistant a real browser. Not a simulated one, not a headless rendering engine pretending to be a browser -- an actual Chromium, Firefox, or WebKit instance that it can navigate, click, type into, screenshot, and inspect. It is the most capable browser automation MCP server available, with over 1.4 million weekly downloads and 20 tools that cover everything from basic navigation to network request inspection.
This guide walks through the full setup, explains how the server works under the hood, and covers practical workflows you can use immediately.
What Playwright MCP Exposes
The server provides 20 tools organized around browser interaction:
Navigation and page management:
- browser_navigate -- go to a URL
- browser_navigate_back -- go back in history
- browser_tabs -- list, create, close, or switch between tabs
- browser_wait_for -- wait for text to appear, disappear, or a timeout to elapse
Element interaction:
- browser_click -- click an element by reference
- browser_type -- type text into an input field
- browser_hover -- hover over an element
- browser_press_key -- press a keyboard key
- browser_select_option -- choose an option from a dropdown
- browser_drag -- drag and drop between two elements
- browser_fill_form -- fill multiple form fields at once
- browser_file_upload -- upload files to a file input
- browser_handle_dialog -- accept or dismiss browser dialogs (alerts, confirms, prompts)
Page inspection:
- browser_snapshot -- capture the accessibility tree of the current page
- browser_take_screenshot -- take a visual screenshot (PNG or JPEG)
- browser_console_messages -- retrieve console log output
- browser_network_requests -- list all network requests since page load
- browser_evaluate -- run arbitrary JavaScript on the page
Browser management:
- browser_resize -- change the viewport dimensions
- browser_close -- close the browser
- browser_install -- install the browser binary if missing
That is a comprehensive set. Most browser automation tasks you would do manually -- navigating to a page, filling out a form, checking if something rendered correctly, debugging a network call -- are covered by these tools without workarounds.
The Key Insight: Accessibility Snapshots Over Screenshots
This is the single most important thing to understand about Playwright MCP, and it is what separates it from every other browser automation server.
By default, Playwright MCP does not use screenshots to understand a page. It reads the accessibility tree -- the same structured representation that screen readers use. Every interactive element on the page gets a unique reference, a role (button, link, textbox, heading), and a label. Your AI assistant sees something like this:
- heading "Sign Up" [level=1]
- textbox "Email address" [ref=e1]
- textbox "Password" [ref=e2]
- button "Create account" [ref=e3]
- link "Already have an account?" [ref=e4]
When the assistant wants to click the "Create account" button, it uses ref=e3. There is no guessing, no pixel coordinate estimation, no fragile CSS selectors. The interaction is deterministic -- it targets the exact element every time.
This approach is more reliable than screenshot-based interaction (where a vision model interprets pixel data and estimates click coordinates) and more robust than CSS selectors (which break when class names change or when elements are dynamically generated). It is also more token-efficient: a structured text snapshot is far smaller than a base64-encoded screenshot image.
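To make the ref mechanism concrete, here is a minimal sketch that parses snapshot lines like the excerpt above and looks up an element's ref by role and label. The `parseSnapshot` and `findRef` helpers are hypothetical, and the line format is assumed to match the simplified excerpt; a real snapshot may carry more attributes and nesting.

```typescript
// Hypothetical types and helpers for illustration; not part of Playwright MCP.
interface SnapshotNode {
  role: string;
  label: string;
  ref: string | null; // entries without a [ref=...] marker get null
}

// Parse lines such as: - button "Create account" [ref=e3]
function parseSnapshot(text: string): SnapshotNode[] {
  const parsed: SnapshotNode[] = [];
  const linePattern = /^\s*-\s+(\w+)\s+"([^"]*)"(.*)$/;
  for (const line of text.split("\n")) {
    const match = linePattern.exec(line);
    if (match === null) continue; // skip anything that is not an element entry
    const refMatch = /\[ref=([^\]]+)\]/.exec(match[3]);
    parsed.push({
      role: match[1],
      label: match[2],
      ref: refMatch ? refMatch[1] : null,
    });
  }
  return parsed;
}

// Look up a ref by exact role and label; returns null when nothing matches.
function findRef(nodes: SnapshotNode[], role: string, label: string): string | null {
  const found = nodes.find((n) => n.role === role && n.label === label);
  return found ? found.ref : null;
}

const signupSnapshot = [
  '- heading "Sign Up" [level=1]',
  '- textbox "Email address" [ref=e1]',
  '- textbox "Password" [ref=e2]',
  '- button "Create account" [ref=e3]',
  '- link "Already have an account?" [ref=e4]',
].join("\n");

const snapshotNodes = parseSnapshot(signupSnapshot);
const createAccountRef = findRef(snapshotNodes, "button", "Create account");
```

The lookup is an exact match on role and label, which is why the interaction does not depend on layout, class names, or pixel positions.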
Playwright MCP does support an optional vision mode (enabled with --caps=vision) for cases where you need the AI to see the actual visual rendering. But for most automation tasks, the accessibility snapshot is faster, cheaper, and more reliable.
Setup
Playwright MCP requires no API keys and no environment variables. It runs locally on your machine and launches a browser instance on demand. You need Node.js installed -- that is the only prerequisite.
Claude Code
Add to your project's .mcp.json or configure via the CLI:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["-y", "@playwright/mcp"]
    }
  }
}
Cursor
Create or edit .cursor/mcp.json in your project root:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["-y", "@playwright/mcp"]
    }
  }
}
VS Code (Copilot)
Add to your VS Code settings.json or workspace settings:
{
  "mcp": {
    "servers": {
      "playwright": {
        "command": "npx",
        "args": ["-y", "@playwright/mcp"]
      }
    }
  }
}
Optional flags
You can pass additional flags to customize behavior:
"args": ["-y", "@playwright/mcp", "--browser", "firefox"]
- --browser chromium|firefox|webkit -- choose the browser engine (default: chromium)
- --headless -- run without a visible browser window
- --caps=vision -- enable vision mode (screenshot-based interaction alongside snapshots)
- --caps=pdf -- enable PDF generation capabilities
- --port <number> -- run as an SSE server instead of stdio
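If you generate editor configs from a script, the flags can be assembled programmatically. The sketch below builds a server entry from the `--browser` and `--headless` flags listed above; the `playwrightServerEntry` helper and its option names are hypothetical, and the output shape assumes the `mcpServers` layout shown in the Claude Code and Cursor examples.

```typescript
// Hypothetical helper for illustration; only the flags named in this guide are used.
type BrowserEngine = "chromium" | "firefox" | "webkit";

interface ServerOptions {
  browser?: BrowserEngine;
  headless?: boolean;
}

interface McpServerEntry {
  command: string;
  args: string[];
}

function playwrightServerEntry(options: ServerOptions = {}): McpServerEntry {
  const args = ["-y", "@playwright/mcp"];
  if (options.browser !== undefined) {
    args.push("--browser", options.browser);
  }
  if (options.headless === true) {
    args.push("--headless");
  }
  return { command: "npx", args };
}

// Example: headless Firefox, serialized in the mcpServers layout.
const mcpConfig = {
  mcpServers: {
    playwright: playwrightServerEntry({ browser: "firefox", headless: true }),
  },
};
const mcpConfigJson = JSON.stringify(mcpConfig, null, 2);
```

Calling `playwrightServerEntry()` with no options reproduces the default entry from the setup sections.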
Workflow 1: Test a Signup Form End-to-End
This is the most common use case. You have a signup form and want to verify it works.
Prompt your AI assistant:
"Navigate to localhost:3000/signup, fill in the email field with test@example.com and the password field with SecurePass123, click the submit button, and tell me what happens."
What the assistant does behind the scenes:
- Calls browser_navigate to open http://localhost:3000/signup
- Calls browser_snapshot to read the page structure
- Identifies the email and password fields by their accessibility labels
- Calls browser_type on each field (or browser_fill_form for both at once)
- Calls browser_click on the submit button
- Calls browser_snapshot again to read the result
- Reports whether the signup succeeded, showed a validation error, or redirected somewhere
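On the wire, each of the steps above is an MCP `tools/call` request from the client to the server. The sketch below builds that sequence as plain data so you can see its shape. The argument names (`url`, `element`, `ref`, `text`) and the refs `e1` to `e3` are assumptions for illustration; the authoritative schema is whatever the server advertises in its tool definitions.

```typescript
// JSON-RPC envelope for an MCP tool call.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical intermediate shape: a tool name plus its arguments.
interface ToolStep {
  name: string;
  args?: Record<string, unknown>;
}

// Wrap each step in a request envelope with sequential ids.
function toRequests(steps: ToolStep[]): ToolCallRequest[] {
  return steps.map((step, index): ToolCallRequest => ({
    jsonrpc: "2.0",
    id: index + 1,
    method: "tools/call",
    params: { name: step.name, arguments: step.args ?? {} },
  }));
}

// Argument names and refs below are assumed, not taken from the server schema.
const signupFlow = toRequests([
  { name: "browser_navigate", args: { url: "http://localhost:3000/signup" } },
  { name: "browser_snapshot" },
  { name: "browser_type", args: { element: "Email address", ref: "e1", text: "test@example.com" } },
  { name: "browser_type", args: { element: "Password", ref: "e2", text: "SecurePass123" } },
  { name: "browser_click", args: { element: "Create account", ref: "e3" } },
  { name: "browser_snapshot" },
]);
```

In practice the assistant issues these calls one at a time, because the refs used in later steps only become known after the first snapshot returns.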
The entire flow happens in seconds. You did not open a browser, you did not write a test script, and you did not switch windows. If the form has validation errors, the assistant reads them from the page structure and tells you exactly what went wrong.
Workflow 2: Check Responsive Layout at Multiple Breakpoints
Responsive design testing usually means manually dragging browser windows or opening DevTools and toggling device presets. With Playwright MCP, you can check all breakpoints in one conversation.
"Navigate to localhost:3000, resize the browser to 375x667 (iPhone SE), take a screenshot. Then resize to 768x1024 (iPad), take a screenshot. Then resize to 1440x900 (desktop), take a screenshot. Compare the layouts."
The assistant uses browser_navigate, then cycles through browser_resize and browser_take_screenshot for each breakpoint. You get three screenshots and a comparison of what changed between viewports -- all without leaving your editor.
For a deeper check, you can ask the assistant to take an accessibility snapshot at each breakpoint to verify that navigation elements collapse correctly, that buttons remain accessible, and that no content is hidden from the accessibility tree.
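The resize-and-capture cycle is easy to express as data if you want to reuse the same breakpoints across projects. This is a rough sketch: the `planBreakpointCheck` helper, the viewport labels, and the `width`, `height`, and `filename` argument names are assumptions for illustration rather than a confirmed schema.

```typescript
interface Viewport {
  label: string;
  width: number;
  height: number;
}

interface PlannedCall {
  name: string;
  arguments: Record<string, unknown>;
}

// The three breakpoints from the prompt above.
const viewports: Viewport[] = [
  { label: "iphone-se", width: 375, height: 667 },
  { label: "ipad", width: 768, height: 1024 },
  { label: "desktop", width: 1440, height: 900 },
];

// Navigate once, then resize and capture for each viewport in order.
function planBreakpointCheck(url: string, targets: Viewport[]): PlannedCall[] {
  const calls: PlannedCall[] = [{ name: "browser_navigate", arguments: { url } }];
  for (const viewport of targets) {
    calls.push({
      name: "browser_resize",
      arguments: { width: viewport.width, height: viewport.height },
    });
    calls.push({
      name: "browser_take_screenshot",
      arguments: { filename: `${viewport.label}.png` }, // assumed argument name
    });
  }
  return calls;
}

const breakpointPlan = planBreakpointCheck("http://localhost:3000", viewports);
```

Adding a `browser_snapshot` call after each resize would cover the accessibility check described above.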
Workflow 3: Fill Out a Multi-Step Form and Verify Validation
Multi-step forms are tedious to test manually because you have to fill in every field correctly before you can proceed to the next step.
"Navigate to localhost:3000/onboarding. Fill out each step of the form with realistic test data. At each step, first try submitting with empty required fields to check validation messages, then fill in valid data and proceed. Report all validation messages you encounter."
The assistant works through each step systematically: it submits the empty form to trigger validation, reads the error messages from the accessibility snapshot, then fills in valid data and moves to the next step. At the end, you get a complete report of every validation message across every step -- something that would take several minutes to do manually.
The browser_fill_form tool is particularly useful here because it can fill multiple fields in a single call, reducing the number of round-trips.
Workflow 4: Debug with Screenshots and JavaScript Evaluation
When something looks wrong on a page and you need to understand the runtime state, Playwright MCP gives you two powerful inspection tools.
"Navigate to localhost:3000/dashboard. Take a screenshot of the page. Then evaluate document.querySelectorAll('.error-banner').length to check if there are any error banners. Also check what's in localStorage for the 'user' key."
The assistant navigates, takes a screenshot for visual context, and then uses browser_evaluate to run JavaScript directly in the page context. You can inspect DOM state, check localStorage or sessionStorage, read JavaScript variables, or call functions exposed on window. This is the equivalent of opening the browser console, but without switching windows.
You can also use browser_console_messages to retrieve any console.log, console.error, or console.warn output that the page generated. This is invaluable for debugging applications that log errors to the console without displaying them in the UI.
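The two checks from the prompt can be bundled into one function so a single evaluation returns both results. The sketch below writes that logic against minimal stand-in interfaces so it can be exercised outside a browser; `inspectDashboard` and the stub objects are hypothetical, and in a real page the same body would read from the global `document` and `localStorage`.

```typescript
// Minimal structural stand-ins for the page's document and localStorage.
interface QueryRoot {
  querySelectorAll(selector: string): { length: number };
}

interface KeyValueStore {
  getItem(key: string): string | null;
}

interface DashboardState {
  errorBannerCount: number;
  user: unknown;
}

function inspectDashboard(doc: QueryRoot, storage: KeyValueStore): DashboardState {
  const raw = storage.getItem("user");
  let user: unknown = raw;
  if (raw !== null) {
    try {
      user = JSON.parse(raw); // most apps store JSON here
    } catch {
      // Not valid JSON: keep the raw string as-is.
    }
  }
  return {
    errorBannerCount: doc.querySelectorAll(".error-banner").length,
    user,
  };
}

// Stubs with made-up values, purely to exercise the function.
const fakeDocument: QueryRoot = {
  querySelectorAll: (selector) => ({ length: selector === ".error-banner" ? 2 : 0 }),
};
const fakeStorage: KeyValueStore = {
  getItem: (key) => (key === "user" ? '{"id":42}' : null),
};

const dashboardState = inspectDashboard(fakeDocument, fakeStorage);
```

Returning a single plain object keeps the result serializable, which matters because the value has to travel back from the page to the assistant.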
Workflow 5: Inspect Network Requests
When an API call fails silently or returns unexpected data, browser_network_requests shows you every request the page made since loading.
"Navigate to localhost:3000/dashboard and wait for it to fully load. Then show me all network requests, especially any that failed."
The assistant lists all requests with their URLs, methods, status codes, and response sizes. Failed requests (4xx or 5xx status codes) stand out immediately. This replaces the workflow of opening DevTools, switching to the Network tab, and scrolling through the request list.
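The filtering the assistant does here amounts to a status-code check. A minimal sketch, assuming each request has already been reduced to a method, URL, and status; the `RequestRecord` shape and the sample entries are invented for illustration and do not reflect the tool's actual output format.

```typescript
// Hypothetical record shape; the real tool output may be structured differently.
interface RequestRecord {
  method: string;
  url: string;
  status: number;
}

// Keep only client (4xx) and server (5xx) errors.
function failedRequests(records: RequestRecord[]): RequestRecord[] {
  return records.filter((r) => r.status >= 400 && r.status < 600);
}

// Made-up sample data.
const sampleRequests: RequestRecord[] = [
  { method: "GET", url: "http://localhost:3000/dashboard", status: 200 },
  { method: "GET", url: "http://localhost:3000/api/user", status: 401 },
  { method: "POST", url: "http://localhost:3000/api/events", status: 500 },
  { method: "GET", url: "http://localhost:3000/static/app.js", status: 304 },
];

const failedSummary = failedRequests(sampleRequests).map(
  (r) => `${r.status} ${r.method} ${r.url}`,
);
```

Redirects and cache hits (3xx) are deliberately excluded, since they are normal traffic rather than failures.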
Token Impact
Playwright MCP's 20 tools consume approximately 10,300 tokens in tool definitions. That is about 5% of Claude's 200K context window. It is on the heavier side compared to minimal servers, but consider what you get: full browser automation without a single line of test code.
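The percentage follows directly from the figures quoted in this guide. A small sketch of the arithmetic, using only those numbers (the token counts are approximate, so treat the results as estimates):

```typescript
// Share of the context window taken up by tool definitions, as a percentage.
function contextSharePercent(toolTokens: number, contextWindow: number): number {
  return (toolTokens / contextWindow) * 100;
}

const CONTEXT_WINDOW_TOKENS = 200000; // Claude's 200K window, as cited above
const PLAYWRIGHT_TOOL_TOKENS = 10300; // approximate figure from this guide
const PUPPETEER_TOOL_TOKENS = 4120;   // approximate figure from this guide

const playwrightShare = contextSharePercent(PLAYWRIGHT_TOOL_TOKENS, CONTEXT_WINDOW_TOKENS);
const puppeteerShare = contextSharePercent(PUPPETEER_TOOL_TOKENS, CONTEXT_WINDOW_TOKENS);

// Tokens saved by choosing the lighter server.
const tokensSaved = PLAYWRIGHT_TOOL_TOKENS - PUPPETEER_TOOL_TOKENS;
```

That works out to roughly 5.15% for Playwright MCP and 2.06% for Puppeteer MCP, a difference of 6,180 tokens per session.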
If you are running Playwright MCP alongside other servers and feel the token pressure, a few options:
- Use --headless mode to avoid the overhead of rendering a visible window (this does not affect token cost, but it speeds up execution)
- If you only need basic navigation and screenshots, consider whether a lighter server like Puppeteer MCP (8 tools, ~4,120 tokens) covers your needs
- Disable Playwright MCP in sessions where you are not doing any browser work
Common Gotchas
"Browser not installed"
The first time you run Playwright MCP, it may need to download the browser binary. If you see an error about the browser not being found, the assistant can call browser_install to download it automatically. You can also run this manually:
npx playwright install chromium
If you are using Firefox or WebKit, replace chromium with the appropriate browser name.
Headless vs headed mode
By default, Playwright MCP runs in headed mode -- you will see a browser window appear on your screen. This is useful for watching what the assistant does in real time. For CI environments or when you do not want the visual distraction, pass the --headless flag.
Pages behind authentication
If your application requires login, you need to have the assistant navigate to the login page and authenticate first. Playwright MCP does not persist sessions between restarts by default. For apps with complex authentication (OAuth, SSO), you may need to provide test credentials or use a pre-authenticated session state.
Single browser context
Playwright MCP manages a single browser context. If you need to test scenarios that require multiple independent sessions (like two users interacting with each other), you will need to handle that sequentially rather than in parallel.
Timeouts on slow pages
If a page takes a long time to load or render, the assistant may try to interact with elements before they are available. Using browser_wait_for with the expected text or element is the reliable way to handle this. Instruct the assistant to wait for specific content to appear before proceeding.
When to Use Playwright MCP vs Alternatives
Playwright MCP is the right choice when you need robust, reliable browser automation with broad tool coverage. Cross-browser testing, accessibility-tree-based interaction, network inspection, tab management, and form filling make it the default for most developers.
Puppeteer MCP (Anthropic, 8 tools, ~4,120 tokens) makes sense if you need minimal browser automation with the smallest possible token footprint. It covers navigation, clicking, typing, screenshots, and JavaScript evaluation. Choose it when you are token-constrained and only need the basics.
Browserbase MCP (Browserbase, cloud-hosted) is the option when you need browsers running in the cloud rather than locally. Useful for CI/CD pipelines, testing from different geographic locations, or when you do not want to install browser binaries on your machine. It requires an API key and a Browserbase account.
For most development workflows -- testing your own application locally, debugging UI issues, verifying forms and interactions -- Playwright MCP is the clear winner. The accessibility snapshot approach alone justifies the token cost.
Get Started
Playwright MCP is one of the few MCP servers that delivers immediate value with almost no configuration. No API keys, no accounts, no environment variables. Add one short JSON entry to your config, and your AI assistant can control a real browser.
For a complete setup that pairs Playwright with complementary servers for your workflow, check out the QA Testing Stack or the Frontend Developer Stack on stackmcp.dev. You can also see detailed server specs on the Playwright MCP server page.