Building an Agent with Playwright CLI
Table of Contents
- Context: Three Integration Approaches
- Approach 1: Playwright CLI with a Custom Skill
- Approach 2: Using Community Skills
- Approach 3: Playwright Test Agents
- CI/CD Integration
- Tradeoffs and Limitations
- Conclusion
You have a Claude Code agent that writes code, runs tests, and manages your repository. Now you want it to open a browser, navigate your application, validate that the UI works, and generate test files --- all without burning through your token budget. Here is how to set it up.
Context: Three Integration Approaches
There are three ways to give a Claude Code agent browser automation capabilities using Playwright:
- Playwright CLI as a skill --- Install @playwright/cli and create a SKILL.md that teaches the agent how to use it
- Community-built skills --- Use pre-built skills like lackeyjb/playwright-skill or testdino-hq/playwright-skill
- Playwright test agents --- Use Playwright’s built-in Planner, Generator, and Healer subagents
Each approach serves different use cases. This post covers all three.
Approach 1: Playwright CLI with a Custom Skill
Installation
Install the CLI globally:
npm install -g @playwright/cli@latest
Verify the installation:
playwright-cli --help
If you need a specific browser:
playwright-cli install-browser --browser=chromium
playwright-cli install-browser --browser=firefox
Configuration
Create a playwright-cli.json in your project root:
{
  "browser": {
    "browserName": "chromium",
    "launchOptions": {
      "headless": true
    }
  },
  "network": {
    "allowedOrigins": ["https://your-app.com", "http://localhost:*"]
  },
  "timeouts": {
    "action": 5000,
    "navigation": 30000
  },
  "outputDir": "./test-output"
}
For local development with a visible browser, set headless: false. For CI/CD, keep it true.
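If you would rather not edit the file by hand when switching between local and CI runs, the toggle can be derived in code. The following is a minimal sketch, assuming the config shape shown in this post; the `CliConfig` interface, the `configFor` helper, and the `isCI` flag are illustrative names, not part of Playwright CLI.

```typescript
// Shape of playwright-cli.json as shown in this post; not an official type definition.
interface CliConfig {
  browser: {
    browserName?: string;
    launchOptions: { headless: boolean };
  };
  network?: { allowedOrigins: string[] };
  timeouts?: { action: number; navigation: number };
  outputDir?: string;
}

// Return a copy of the config with headless on in CI and off locally.
// `isCI` is a hypothetical flag; in practice it might come from an environment variable.
function configFor(base: CliConfig, isCI: boolean): CliConfig {
  return {
    ...base,
    browser: {
      ...base.browser,
      launchOptions: { ...base.browser.launchOptions, headless: isCI },
    },
  };
}

const baseConfig: CliConfig = {
  browser: { browserName: "chromium", launchOptions: { headless: true } },
  network: { allowedOrigins: ["https://your-app.com", "http://localhost:*"] },
  timeouts: { action: 5000, navigation: 30000 },
  outputDir: "./test-output",
};

const localConfig = configFor(baseConfig, false);
const ciConfig = configFor(baseConfig, true);

// Print the local variant; it could be written out as playwright-cli.json.
console.log(JSON.stringify(localConfig, null, 2));
```

The helper copies rather than mutates, so the base configuration can stay checked in unchanged.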
Creating the Skill
Create the skill directory:
mkdir -p .claude/skills/playwright-cli
Create .claude/skills/playwright-cli/SKILL.md:
---
name: playwright-cli
description: Browser automation using Playwright CLI. Use this skill when the user asks to test a web page, validate UI behavior, take screenshots, fill forms, or generate Playwright tests from browser sessions.
---
# Playwright CLI Skill
You have access to browser automation through the `playwright-cli` command. Use it to interact with web pages, validate UI behavior, and generate test code.
## Core Workflow
1. Open a browser session: `playwright-cli open <url> --headed`
2. Take a snapshot to see elements: `playwright-cli snapshot`
3. Read the snapshot file to find element references (e.g., e15, e20)
4. Interact using element refs: `playwright-cli click e15`
5. Take screenshots for visual validation: `playwright-cli screenshot`
6. Close when done: `playwright-cli close`
## Key Commands
### Navigation
- `playwright-cli open <url>` --- Open URL in new browser
- `playwright-cli goto <url>` --- Navigate current page
- `playwright-cli go-back` / `playwright-cli go-forward`
- `playwright-cli reload`
### Interaction
- `playwright-cli click <ref>` --- Click element
- `playwright-cli fill <ref> "<text>"` --- Fill input field
- `playwright-cli type <ref> "<text>"` --- Type character by character
- `playwright-cli press <key>` --- Press keyboard key
- `playwright-cli check <ref>` / `playwright-cli uncheck <ref>`
- `playwright-cli select <ref> "<value>"`
- `playwright-cli hover <ref>`
- `playwright-cli upload <ref> <filepath>`
### Observation
- `playwright-cli snapshot` --- Save page structure as YAML
- `playwright-cli screenshot` --- Save screenshot as PNG
- `playwright-cli console` --- View console messages
- `playwright-cli network` --- View network requests
- `playwright-cli pdf` --- Generate PDF of page
### Session Management
- `playwright-cli -s=<name> open <url>` --- Named session
- `playwright-cli list` --- List active sessions
- `playwright-cli state-save <file>` --- Save auth state
- `playwright-cli state-load <file>` --- Restore auth state
### Debugging
- `playwright-cli tracing-start` / `playwright-cli tracing-stop`
- `playwright-cli video-start` / `playwright-cli video-stop`
## Important Guidelines
- Always run `playwright-cli snapshot` after navigation or significant page changes to get updated element references
- Read the snapshot YAML file to find the correct element reference before interacting
- Use `--headed` during development so the user can see the browser
- Use headless mode (default) for CI/CD and automated testing
- Save authentication state with `state-save` for reuse across sessions
- Screenshots are saved to `.playwright-cli/` by default
- When generating tests, walk through the user flow first, then use the recorded session to produce test code
Agent Configuration (CLAUDE.md)
Add browser automation instructions to your project’s CLAUDE.md:
## Browser Automation
This project uses Playwright CLI for browser automation. When asked to:
- Test a page or UI flow: use the playwright-cli skill
- Validate visual appearance: take screenshots with playwright-cli
- Generate E2E tests: walk through the flow with playwright-cli, then generate a test file
- Debug a frontend issue: open the page with --headed and inspect
The dev server runs on http://localhost:3000. Start it with `npm run dev` before opening the browser.
Always close browser sessions when done to free resources.
Example Session
Here is what a typical agent interaction looks like:
User: "Test the login flow on our app"
Agent actions:
1. npm run dev # Start dev server
2. playwright-cli open http://localhost:3000 --headed # Open browser
3. playwright-cli snapshot # See page structure
4. [reads YAML: e12=input "Email", e15=input "Password", e18=button "Sign In"]
5. playwright-cli fill e12 "test@example.com" # Fill email
6. playwright-cli fill e15 "password123" # Fill password
7. playwright-cli click e18 # Click sign in
8. playwright-cli snapshot # See result
9. [reads YAML: confirms dashboard loaded]
10. playwright-cli screenshot # Visual evidence
11. playwright-cli close # Clean up
Total token cost: approximately 500 tokens of commands plus about 800 tokens for a snapshot read. Compare that to the 8,000+ tokens the same flow costs over MCP.
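Steps 4 and 9 of the session, where the agent reads the snapshot to find element references, can be sketched in code. This is a rough illustration only: it assumes each snapshot line looks like `- textbox "Email" [ref=e12]`, which may not match the file your CLI version writes, and `parseSnapshot` and `findRef` are hypothetical helpers rather than anything the CLI provides.

```typescript
interface SnapshotElement {
  role: string;
  name: string;
  ref: string;
}

// Parse lines such as `- textbox "Email" [ref=e12]`.
// ASSUMPTION: this line format is illustrative; check a real snapshot file before relying on it.
function parseSnapshot(snapshot: string): SnapshotElement[] {
  const pattern = /^\s*-\s+(\w+)(?:\s+"([^"]*)")?.*\[ref=(e\d+)\]/;
  const elements: SnapshotElement[] = [];
  for (const line of snapshot.split("\n")) {
    const match = pattern.exec(line);
    if (match) {
      elements.push({ role: match[1], name: match[2] ?? "", ref: match[3] });
    }
  }
  return elements;
}

// Look up the ref of the first element with the given role and accessible name.
function findRef(elements: SnapshotElement[], role: string, name: string): string | undefined {
  return elements.find((el) => el.role === role && el.name === name)?.ref;
}

// Hand-written sample that mimics the login page from the session above.
const sampleSnapshot = [
  "- generic [ref=e1]:",
  '  - textbox "Email" [ref=e12]',
  '  - textbox "Password" [ref=e15]',
  '  - button "Sign In" [ref=e18]',
].join("\n");

const parsedElements = parseSnapshot(sampleSnapshot);
const loginCommands = [
  `playwright-cli fill ${findRef(parsedElements, "textbox", "Email")} "test@example.com"`,
  `playwright-cli click ${findRef(parsedElements, "button", "Sign In")}`,
];
console.log(loginCommands.join("\n"));
```

Because refs change after navigation, a lookup like this is only valid against the most recent snapshot.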
Approach 2: Using Community Skills
lackeyjb/playwright-skill (1.9k stars)
This is the most popular community skill. It takes a different approach --- instead of wrapping the CLI, it has Claude write and execute Playwright scripts directly through a custom executor.
Installation:
# Global installation
git clone https://github.com/lackeyjb/playwright-skill.git /tmp/pw-skill
mkdir -p ~/.claude/skills
cp -r /tmp/pw-skill/skills/playwright-skill ~/.claude/skills/
cd ~/.claude/skills/playwright-skill
npm run setup
rm -rf /tmp/pw-skill
Or project-specific:
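# Clone first (the global steps above delete /tmp/pw-skill when they finish)
git clone https://github.com/lackeyjb/playwright-skill.git /tmp/pw-skill
mkdir -p .claude/skills
cp -r /tmp/pw-skill/skills/playwright-skill .claude/skills/
cd .claude/skills/playwright-skill
npm run setup
rm -rf /tmp/pw-skill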
How it works:
- Claude receives a browser automation request
- Claude writes custom Playwright JavaScript/TypeScript code
- The skill’s run.js executor runs the code with proper module resolution
- Browser launches visibly (headless: false, slowMo: 100ms by default)
- Results return as screenshots and console output
Key differences from CLI approach:
- Model-invoked (Claude decides when to use it)
- Writes full Playwright scripts rather than issuing individual CLI commands
- Includes an API_REFERENCE.md that loads only when needed (progressive disclosure)
- Consists of 314 lines of instructions rather than a persistent server process
testdino-hq/playwright-skill
This skill focuses on teaching the agent to write production-quality tests. It contains 70+ guides organized into 5 skill packs covering locators, assertions, page object models, fixtures, and CI/CD configuration.
Installation:
npx skills add testdino-hq/playwright-skill --skill playwright-cli
Best for: Teams that want their agent to generate tests following specific patterns and best practices, rather than ad-hoc automation.
Approach 3: Playwright Test Agents
Playwright v1.56 (October 2025) introduced three specialized subagents that integrate directly with Claude Code. These are not replacements for CLI or MCP --- they are higher-level orchestrators that use Playwright underneath.
Setup
npx playwright init-agents --loop=claude
This creates agent definition files in your project. Regenerate whenever you update Playwright to get access to new tools and instructions.
The Three Agents
Planner Agent explores your application and produces Markdown test plans:
Input: "Generate a plan for guest checkout flow"
Output: specs/guest-checkout.md (detailed test scenarios with steps, data, and assertions)
The Planner navigates your live application through a real browser, identifies user paths and edge cases, and produces structured plans that both humans and the Generator agent can read.
Generator Agent transforms plans into executable tests:
Input: specs/guest-checkout.md
Output: tests/checkout/guest-checkout.spec.ts
The Generator does not just translate Markdown to code. It actively interacts with your application to verify that selectors work and assertions are valid. It applies best practices --- semantic locators, proper wait strategies, readable test names.
Healer Agent fixes failing tests:
Input: A failing test file
Process: Replays steps, inspects current UI, patches locators/waits
Output: Updated test file that passes
The Healer handles real-world maintenance --- when a button’s text changes, a loading spinner gets added, or a layout shifts. It re-runs the test until it passes, or marks it as skipped if the underlying functionality is genuinely broken.
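The "re-run until it passes, or skip" behavior can be pictured as a bounded retry loop. The sketch below is not how the Healer is implemented; `runTest` and `applyPatch` are hypothetical callbacks standing in for "replay the test" and "patch locators or waits", and the attempt limit is an assumption added so the loop terminates.

```typescript
type HealOutcome = "passed" | "skipped";

// `runTest` and `applyPatch` are hypothetical callbacks, not Playwright APIs.
function healLoop(
  runTest: () => boolean,
  applyPatch: (attempt: number) => void,
  maxAttempts: number,
): { outcome: HealOutcome; attempts: number } {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (runTest()) {
      return { outcome: "passed", attempts: attempt };
    }
    // Only patch when another run will follow to verify the patch.
    if (attempt < maxAttempts) {
      applyPatch(attempt);
    }
  }
  // Out of attempts: treat the functionality as genuinely broken and skip the test.
  return { outcome: "skipped", attempts: maxAttempts };
}

// Simulated test that starts passing once two patches have been applied.
let patchesApplied = 0;
const healed = healLoop(() => patchesApplied >= 2, () => { patchesApplied++; }, 5);

// Simulated test that never passes, whatever is patched.
const broken = healLoop(() => false, () => {}, 3);

console.log(healed, broken);
```

The design point is the exit condition: a test that stays red after the budget is spent is flagged instead of being patched indefinitely.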
Project Structure with Agents
your-project/
  .github/                      # Agent definitions (Markdown files)
  specs/                        # Human-readable test plans (Planner output)
    guest-checkout.md
    user-registration.md
  tests/                        # Generated Playwright tests
    seed.spec.ts                # Bootstrap test with environment setup
    checkout/
      guest-checkout.spec.ts    # Generator output
    auth/
      registration.spec.ts
  playwright.config.ts
Using Agents in Practice
The workflow is sequential:
1. "Plan tests for the checkout flow"
-> Planner explores the app, writes specs/checkout.md
2. "Generate tests from the checkout plan"
-> Generator reads specs/checkout.md, writes tests/checkout/
3. [Tests fail because a button changed]
"Heal the failing checkout tests"
-> Healer debugs, patches, re-runs until green
Customization
Agent definitions are Markdown files, making them easy to modify:
- Adjust the Planner’s exploration strategy with sample user stories
- Configure the Generator’s code style to match your naming conventions
- Tune the Healer’s fix strategies based on common failure patterns in your app
CI/CD Integration
Headless Mode for Pipelines
For CI/CD, run the CLI in headless mode (the default):
# In your CI script
playwright-cli open https://staging.your-app.com
playwright-cli snapshot
playwright-cli screenshot
playwright-cli close
Or use the configuration file:
{
  "browser": {
    "launchOptions": { "headless": true }
  }
}
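In a pipeline you usually want the sequence to stop at the first failing command while still closing the browser session. Here is a minimal sketch of that control flow, using the four commands from the CI script; the `smokeCheck` function and the injected `run` callback are illustrative names, and the callback is assumed to return the process exit code.

```typescript
// Runs one CLI invocation with the given arguments and returns its exit code.
type RunCommand = (args: string[]) => number;

// `run` is injected so the sketch does not depend on how the CLI is spawned.
function smokeCheck(url: string, run: RunCommand): boolean {
  const steps: string[][] = [["open", url], ["snapshot"], ["screenshot"]];
  let ok = true;
  try {
    for (const step of steps) {
      if (run(step) !== 0) {
        ok = false;
        break; // stop at the first failing step
      }
    }
  } finally {
    run(["close"]); // always release the browser session
  }
  return ok;
}

// Dry run with a fake runner that records calls and fails on "snapshot".
const recordedCalls: string[] = [];
const fakeRun: RunCommand = (args) => {
  recordedCalls.push(args.join(" "));
  return args[0] === "snapshot" ? 1 : 0;
};

const smokeOk = smokeCheck("https://staging.your-app.com", fakeRun);
console.log(smokeOk, recordedCalls);
```

A real runner could wrap Node's `spawnSync("playwright-cli", args, { stdio: "inherit" })` and return its `status`; injecting it keeps the control flow testable without a browser.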
GitHub Actions Example
name: E2E Tests
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      - run: npm install -g @playwright/cli@latest
      - run: npx playwright install --with-deps chromium
      - run: npm run dev &
      - run: npx playwright test
      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: test-results
          path: test-output/
Combining Approaches
The most effective CI/CD setup combines Playwright test agents for test generation and maintenance with CLI for token-efficient execution:
- Developers use Planner/Generator agents during development to create tests
- Healer agent runs post-merge to fix tests broken by UI changes
- Standard npx playwright test runs the generated tests in CI
- CLI is used by agents during code review to validate UI changes
Tradeoffs and Limitations
Security Considerations
When using Playwright CLI or skills with Claude Code, be aware that:
- Console output and page content may reach Anthropic’s servers during the agent session
- Screenshots remain local (saved to disk)
- Target development environments with dummy data, not production systems with sensitive information
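One way to enforce the "development environments only" rule is to check a URL against an allowlist before the agent opens it. The sketch below is a hypothetical guard, not the CLI's own matcher: it assumes a trailing `:*` means "any port on this scheme and host", which mirrors the look of the `allowedOrigins` entries earlier in this post but is not confirmed behavior.

```typescript
// Parse an absolute URL, returning null instead of throwing on bad input.
function tryParseUrl(value: string): URL | null {
  try {
    return new URL(value);
  } catch {
    return null;
  }
}

// ASSUMPTION: a trailing ":*" means "any port on this scheme and host".
// This is an illustrative rule, not the CLI's actual matching logic.
function isAllowedOrigin(target: string, allowed: string[]): boolean {
  const url = tryParseUrl(target);
  if (url === null) return false;
  return allowed.some((pattern) => {
    if (pattern.endsWith(":*")) {
      const base = tryParseUrl(pattern.slice(0, -2));
      return base !== null && url.protocol === base.protocol && url.hostname === base.hostname;
    }
    const exact = tryParseUrl(pattern);
    return exact !== null && url.origin === exact.origin;
  });
}

const allowedOrigins = ["https://your-app.com", "http://localhost:*"];

console.log(isAllowedOrigin("http://localhost:3000/login", allowedOrigins));
console.log(isAllowedOrigin("https://prod.example.com/admin", allowedOrigins));
```

Comparing parsed origins rather than string prefixes avoids accepting look-alike hosts such as `your-app.com.evil.test`.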
Complexity Limits
Community feedback (from Hacker News discussions) identifies scenarios where AI-driven browser automation struggles:
- OAuth and complex authentication flows
- Long action chains requiring deep context
- Edge cases like date pickers, drag-and-drop, and complex file uploads
These limitations apply to both CLI and MCP approaches.
Learning Curve
LLMs may not be trained on Playwright CLI commands since the tool launched in early 2026. The skill definition is essential --- without it, agents may hallucinate commands or use incorrect arguments. This is why community skills exist: they bridge the gap between the model’s training data and the tool’s actual API.
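A cheap safeguard against hallucinated commands is to check each command line against the set the skill documents before running it. This is a minimal sketch using only the subcommands listed in this post; `subcommandOf` and `isKnownCommand` are hypothetical helpers, and the set would need extending for whatever your installed CLI version actually supports.

```typescript
// Subcommands listed in this post; extend the set if your CLI version has more.
const KNOWN_COMMANDS = new Set<string>([
  "open", "goto", "go-back", "go-forward", "reload",
  "click", "fill", "type", "press", "check", "uncheck", "select", "hover", "upload",
  "snapshot", "screenshot", "console", "network", "pdf",
  "list", "state-save", "state-load", "close", "install-browser",
  "tracing-start", "tracing-stop", "video-start", "video-stop",
]);

// Return the subcommand of a `playwright-cli ...` line, skipping flags such as -s=<name>.
function subcommandOf(commandLine: string): string | undefined {
  const tokens = commandLine.trim().split(/\s+/);
  if (tokens[0] !== "playwright-cli") return undefined;
  return tokens.slice(1).find((token) => !token.startsWith("-"));
}

// True only for playwright-cli lines whose subcommand is in the documented set.
function isKnownCommand(commandLine: string): boolean {
  const sub = subcommandOf(commandLine);
  return sub !== undefined && KNOWN_COMMANDS.has(sub);
}

console.log(isKnownCommand("playwright-cli click e15"));
console.log(isKnownCommand("playwright-cli tap e15")); // "tap" is a made-up example of a hallucinated command
```

A check like this only catches unknown subcommands; wrong arguments to a valid subcommand still need the skill's instructions to prevent.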
Conclusion
Building a Claude Code agent with Playwright CLI is straightforward:
- Install @playwright/cli globally
- Create a SKILL.md that teaches the agent the command set
- Add browser automation guidelines to your CLAUDE.md
- Use headless mode for CI/CD, headed mode for development
For teams that want pre-built solutions, lackeyjb/playwright-skill provides a battle-tested skill with 1.9k GitHub stars. For teams focused on comprehensive test generation, Playwright’s built-in test agents (Planner, Generator, Healer) offer a higher-level workflow that handles the full lifecycle from planning through maintenance.
The key insight across all three approaches: browser automation for AI agents works best when it operates through the tools the agent already knows --- shell commands, file reads, and Markdown instructions --- rather than through a separate protocol layer.