
How to Set Up and Use the Playwright MCP Server — Complete Guide

Step-by-step guide to browser automation with Playwright MCP. Navigate pages, fill forms, take screenshots, and run accessibility checks from your AI editor.

Tags: mcp, playwright, testing, browser, how-to

The Playwright MCP server gives your AI assistant a real browser -- Chromium, Firefox, or WebKit -- that it can navigate, click, type into, screenshot, and inspect. With 20 tools and over 1.4 million weekly npm downloads, it is the most capable browser automation MCP server available. No API keys, no accounts, no configuration beyond three lines of JSON.

TL;DR: Add @playwright/mcp to your editor config. The server uses accessibility tree snapshots (not screenshots) to interact with pages, making it faster, cheaper, and more reliable than vision-based alternatives. No API key required.

Playwright MCP | npm | Tools: 20 | ~10,300 tokens

graph LR
    A[Your Editor] --> B[AI Assistant]
    B --> C[Playwright MCP]
    C --> D[Real Browser]
    D --> E[Your App / Any Website]

What the Server Exposes

Navigation: browser_navigate, browser_navigate_back, browser_tabs, browser_wait_for

Interaction: browser_click, browser_type, browser_hover, browser_press_key, browser_select_option, browser_drag, browser_fill_form, browser_file_upload, browser_handle_dialog

Inspection: browser_snapshot, browser_take_screenshot, browser_console_messages, browser_network_requests, browser_evaluate

Management: browser_resize, browser_close, browser_install

The Key Insight: Accessibility Snapshots

Playwright MCP reads the accessibility tree -- the same structured representation screen readers use -- instead of interpreting screenshots with a vision model. Every element gets a unique reference:

- heading "Sign Up" [level=1]
- textbox "Email address" [ref=e1]
- textbox "Password" [ref=e2]
- button "Create account" [ref=e3]

When the assistant clicks "Create account", it uses ref=e3. No pixel guessing, no fragile CSS selectors, no base64 screenshot overhead. This approach is:

  • More reliable than screenshot-based interaction
  • More robust than CSS selectors that break with class name changes
  • More token-efficient since text is far smaller than encoded images

Vision mode is available (--caps=vision) for cases where visual rendering matters, but the accessibility snapshot handles most automation tasks.
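To make the ref lookup concrete, here is a minimal sketch that parses a snapshot shaped like the illustrative one above and resolves an element to its ref by role and accessible name. The line format, the regular expression, and the helper names parse_snapshot and find_ref are assumptions for illustration; real snapshots from the server may be nested and carry more attributes.

```python
import re

# Assumed line shape, modeled on the illustrative snapshot above:
#   - role "accessible name" [key=value]
LINE = re.compile(
    r'^\s*-\s+(?P<role>\w+)\s+"(?P<name>[^"]*)"(?:\s+\[(?P<attr>[^\]]*)\])?\s*$'
)


def parse_snapshot(text):
    """Return a list of dicts with role, name, and ref (None when absent)."""
    nodes = []
    for line in text.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # skip blank or unrecognized lines
        attr = m.group("attr") or ""
        ref = attr[len("ref="):] if attr.startswith("ref=") else None
        nodes.append({"role": m.group("role"), "name": m.group("name"), "ref": ref})
    return nodes


def find_ref(nodes, role, name):
    """Look up the ref for an element by role and accessible name."""
    for node in nodes:
        if node["role"] == role and node["name"] == name:
            return node["ref"]
    return None


SNAPSHOT = '''
- heading "Sign Up" [level=1]
- textbox "Email address" [ref=e1]
- textbox "Password" [ref=e2]
- button "Create account" [ref=e3]
'''

nodes = parse_snapshot(SNAPSHOT)
print(find_ref(nodes, "button", "Create account"))  # e3
```

The point of the sketch is that the lookup is a plain text match on role and name, with no coordinates or selectors involved.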

Setup

No API keys. No environment variables. Just Node.js.

Claude Code

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["-y", "@playwright/mcp"]
    }
  }
}

See the full Claude Code setup guide.

Cursor

Create .cursor/mcp.json:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["-y", "@playwright/mcp"]
    }
  }
}

See the Cursor setup guide.

VS Code (Copilot)

{
  "mcp": {
    "servers": {
      "playwright": {
        "command": "npx",
        "args": ["-y", "@playwright/mcp"]
      }
    }
  }
}

Optional flags

"args": ["-y", "@playwright/mcp", "--browser", "firefox"]
  • --browser chromium|firefox|webkit -- choose engine (default: chromium)
  • --headless -- no visible browser window
  • --caps=vision -- enable screenshot-based interaction alongside snapshots
  • --caps=pdf -- enable PDF generation
  • --port <number> -- run as SSE server
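If you maintain configs for several editors, a small generator can keep them in sync. The sketch below only uses the flags listed above and the two top-level shapes shown in the Setup section; build_args and build_config are hypothetical helper names, and passing a single caps value at a time is an assumption.

```python
import json


def build_args(browser=None, headless=False, caps=None, port=None):
    """Assemble the npx argument list from the flags listed above."""
    args = ["-y", "@playwright/mcp"]
    if browser:
        if browser not in ("chromium", "firefox", "webkit"):
            raise ValueError(f"unsupported browser: {browser}")
        args += ["--browser", browser]
    if headless:
        args.append("--headless")
    if caps:
        # Assumption: one capability per call, e.g. "vision" or "pdf".
        args.append(f"--caps={caps}")
    if port is not None:
        args += ["--port", str(port)]
    return args


def build_config(editor, **flags):
    """Wrap the server entry in the top-level shape each editor expects."""
    server = {"playwright": {"command": "npx", "args": build_args(**flags)}}
    if editor == "vscode":
        return {"mcp": {"servers": server}}
    return {"mcpServers": server}  # Claude Code and Cursor


print(json.dumps(build_config("cursor", browser="firefox", headless=True), indent=2))
```

Calling build_config("vscode") instead produces the nested "mcp" / "servers" shape used in the VS Code example.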

Workflows

Test a signup form end-to-end

"Navigate to localhost:3000/signup, fill email with test@example.com, password with SecurePass123, click submit, and tell me what happens."

The assistant:

  1. browser_navigate to the URL
  2. browser_snapshot to read the page structure
  3. browser_fill_form to enter credentials
  4. browser_click on submit
  5. browser_snapshot again to read the result

The entire flow takes seconds. No browser window switching, no test scripts.
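Under the hood, each step is an MCP tools/call request. The sketch below builds the five requests for this flow as plain dictionaries. The JSON-RPC envelope follows the MCP convention, but the argument names (url, fields, element, ref) and the ref values are assumptions for illustration; check the tool schemas your server version reports.

```python
import json
from itertools import count

_ids = count(1)  # simple request-id generator


def tool_call(name, arguments=None):
    """Build an MCP `tools/call` JSON-RPC request for one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Argument names and refs below are assumed, not taken from the server's schema.
flow = [
    tool_call("browser_navigate", {"url": "http://localhost:3000/signup"}),
    tool_call("browser_snapshot"),
    tool_call("browser_fill_form", {
        "fields": [
            {"name": "Email address", "ref": "e1", "value": "test@example.com"},
            {"name": "Password", "ref": "e2", "value": "SecurePass123"},
        ]
    }),
    tool_call("browser_click", {"element": "Create account", "ref": "e3"}),
    tool_call("browser_snapshot"),
]

for request in flow:
    print(json.dumps(request))
```

In practice the assistant issues these calls for you; the sketch is only meant to show that the whole test is five small messages.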

Check responsive layout at multiple breakpoints

"Navigate to localhost:3000, take screenshots at 375x667 (mobile), 768x1024 (tablet), and 1440x900 (desktop). Compare layouts."

  • Cycles through browser_resize + browser_take_screenshot for each breakpoint
  • Three screenshots and a layout comparison without leaving your editor
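The resize-then-screenshot cycle can be sketched as a simple plan generator. The breakpoints come from the prompt above; the argument names (width, height, filename) and the file naming scheme are assumptions for illustration.

```python
# Breakpoints from the prompt above: label -> (width, height)
BREAKPOINTS = {
    "mobile": (375, 667),
    "tablet": (768, 1024),
    "desktop": (1440, 900),
}


def plan_screenshots(breakpoints):
    """Return (tool, arguments) pairs: one resize and one screenshot per breakpoint."""
    steps = []
    for label, (width, height) in breakpoints.items():
        # Argument names are assumed, not taken from the server's schema.
        steps.append(("browser_resize", {"width": width, "height": height}))
        steps.append(
            ("browser_take_screenshot", {"filename": f"{label}-{width}x{height}.png"})
        )
    return steps


for tool, arguments in plan_screenshots(BREAKPOINTS):
    print(tool, arguments)
```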

Debug with JavaScript evaluation

"Navigate to localhost:3000/dashboard. Take a screenshot. Check document.querySelectorAll('.error-banner').length and what's in localStorage for the 'user' key."

  • browser_take_screenshot for visual context
  • browser_evaluate runs JavaScript directly in the page context
  • browser_console_messages retrieves any console.error output the page generated

Inspect network requests

"Navigate to localhost:3000/dashboard, wait for full load, then show all network requests -- especially failures."

  • browser_network_requests lists all requests with URLs, methods, status codes
  • Failed requests (4xx, 5xx) stand out immediately
  • Replaces the DevTools Network tab workflow
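The filtering the assistant does here amounts to a status-code check. A minimal sketch, assuming the requests have already been collected into records with a method, URL, and status; the Request type and the sample data are hypothetical, and the server's actual output format may differ.

```python
from dataclasses import dataclass


@dataclass
class Request:
    method: str
    url: str
    status: int


def failures(requests):
    """Keep only client (4xx) and server (5xx) errors."""
    return [r for r in requests if 400 <= r.status < 600]


# Hypothetical sample data for illustration.
sample = [
    Request("GET", "http://localhost:3000/dashboard", 200),
    Request("GET", "http://localhost:3000/api/user", 401),
    Request("POST", "http://localhost:3000/api/events", 500),
    Request("GET", "http://localhost:3000/static/app.js", 304),
]

for r in failures(sample):
    print(f"{r.status} {r.method} {r.url}")
```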

Multi-step form validation

"Navigate to localhost:3000/onboarding. At each step, try submitting empty to check validation, then fill valid data and proceed. Report all validation messages."

  • Works through each step systematically using browser_fill_form
  • Reads error messages from accessibility snapshots
  • Complete validation report across all steps

Token Impact

At ~10,300 tokens (~5% of 200K), Playwright MCP is on the heavier side. Options if you feel token pressure:

  • Use --headless for faster execution (note that this does not reduce token cost)
  • If you only need basic navigation and screenshots, consider Puppeteer MCP (8 tools, ~4,120 tokens); see Puppeteer MCP vs Playwright MCP
  • Disable Playwright MCP in non-browser sessions
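The percentages are easy to check. The sketch below works through the arithmetic using the token counts quoted in this article and a 200K-token context window.

```python
CONTEXT_WINDOW = 200_000    # tokens
PLAYWRIGHT_TOKENS = 10_300  # approximate figure quoted above
PUPPETEER_TOKENS = 4_120    # approximate figure quoted above


def share(tokens, window=CONTEXT_WINDOW):
    """Percentage of the context window consumed by a server's tool definitions."""
    return 100 * tokens / window


print(f"Playwright MCP: {share(PLAYWRIGHT_TOKENS):.2f}%")  # 5.15%
print(f"Puppeteer MCP:  {share(PUPPETEER_TOKENS):.2f}%")   # 2.06%
print(f"Saved by switching: {PLAYWRIGHT_TOKENS - PUPPETEER_TOKENS} tokens")  # 6180
```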

See how to cut MCP token costs for more strategies.

Common Gotchas

"Browser not installed"

The first run may need to download the browser binary. The assistant can call browser_install, or you can install it manually:

npx playwright install chromium

Headed vs headless

Default is headed -- a visible browser window appears. Use --headless for CI or to avoid visual distraction.

Pages behind authentication

The assistant must navigate to the login page and authenticate first. Sessions do not persist between restarts. For complex auth (OAuth, SSO), provide test credentials.

Single browser context

One browser context at a time. Multi-user scenarios must be tested sequentially, not in parallel.

Timeouts on slow pages

Use browser_wait_for to wait for specific text or elements before interacting. Instruct the assistant to wait for content to appear before proceeding.
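Conceptually, waiting for content is a poll-until-timeout loop. The sketch below illustrates that idea only; it is not how the server implements browser_wait_for. The wait_for_text helper and the FakePage class are hypothetical.

```python
import time


def wait_for_text(get_snapshot, text, timeout=5.0, interval=0.1):
    """Poll get_snapshot() until text appears; return True, or False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if text in get_snapshot():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakePage:
    """Hypothetical page whose content appears on the third poll."""

    def __init__(self):
        self.polls = 0

    def snapshot(self):
        self.polls += 1
        return '- heading "Dashboard"' if self.polls >= 3 else '- text "Loading..."'


page = FakePage()
print(wait_for_text(page.snapshot, "Dashboard", timeout=2.0, interval=0.01))  # True
```

Waiting on a specific piece of text, rather than sleeping for a fixed time, keeps the flow fast on quick pages and tolerant of slow ones.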

For more debugging tips, see the MCP troubleshooting guide.

Playwright MCP vs Alternatives

  • Playwright MCP: Full coverage -- cross-browser, accessibility-tree interaction, network inspection, tabs, forms. Default choice for most developers.
  • Puppeteer MCP (8 tools, ~4,120 tokens): Minimal footprint, basics only. Choose when token-constrained. See the full comparison.
  • Browserbase MCP (cloud-hosted): Browsers in the cloud for CI/CD or geographic testing. Requires API key and account.

Pairing with Other Servers

  • Supabase MCP: Apply schema changes, then verify the frontend renders correctly
  • GitHub MCP: Push changes, then verify the preview deployment
  • Stripe MCP: Create products in Stripe, then test the checkout flow in a real browser

This server is particularly valuable for frontend developers and QA testing workflows.

Getting Started

Add three lines of JSON to your config and your assistant controls a real browser. No API keys, no accounts, no environment variables. Test with: "Navigate to https://example.com and take a screenshot."

For a complete setup, see the QA Testing Stack or Frontend Developer Stack.
