A Model Context Protocol (MCP) server for browser automation using Playwright, deployed on Google Cloud Run. It exposes browser automation capabilities through both REST API endpoints and the MCP protocol.
- Browser Automation Tools:
  - Navigate to URLs
  - Take screenshots
  - Click elements
  - Fill form fields
  - Read page content
- Dual Protocol Support:
  - REST API endpoints for direct HTTP access
  - MCP protocol for AI agent integration
- Cloud-Native Design:
  - Deployed on Google Cloud Run
  - Docker containerized
  - Auto-scaling and serverless
- Security Features:
  - Domain allowlist
  - Google Cloud identity token authentication
  - Audit logging
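As an illustration of the domain allowlist idea, here is a minimal sketch in Python. The function name `is_allowed` and the choice to accept subdomains of an allowed domain are assumptions made for this example; the README does not say how the server actually matches hosts.

```python
from urllib.parse import urlparse


def is_allowed(url: str, allowed_domains: set[str]) -> bool:
    """Return True if the URL's host is an allowed domain or a subdomain of one.

    Hypothetical helper: subdomain matching is an assumption, not documented behavior.
    """
    host = (urlparse(url).hostname or "").lower()
    if not host:
        return False
    # Match the domain itself, or any host ending in ".<domain>".
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


allowed = {"example.com", "github.com"}
print(is_allowed("https://example.com/page", allowed))   # True
print(is_allowed("https://docs.github.com", allowed))    # True
print(is_allowed("https://evil-example.com", allowed))   # False
```

Comparing against the parsed hostname with a leading dot, rather than doing a plain substring check, keeps look-alike hosts such as `evil-example.com` from passing.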
- Clone the repository:
      git clone https://github.com/yourusername/browser-use-mcp.git
      cd browser-use-mcp
- Set up Python environment:
      python3 -m venv .venv
      source .venv/bin/activate  # On Windows: .venv\Scripts\activate
      pip install -r requirements.txt
      playwright install chromium
- Run the MCP server:
      uvicorn app:app --host 0.0.0.0 --port 8080
- Test with the chat interface:
      # In another terminal
      python3 -m http.server 8090
      # Open http://localhost:8090/chat_interface_v2.html
    # Take a screenshot
    curl -X POST http://localhost:8080/mcp/tools/screenshot \
      -H "Content-Type: application/json" \
      -d '{"url": "https://example.com", "full_page": false}'

    # Navigate to a URL
    curl -X POST http://localhost:8080/mcp/tools/navigate \
      -H "Content-Type: application/json" \
      -d '{"url": "https://github.com"}'

    # Read page content
    curl -X POST http://localhost:8080/mcp/tools/read \
      -H "Content-Type: application/json" \
      -d '{"url": "https://example.com", "selector": "h1"}'
    import asyncio
    import aiohttp

    async def take_screenshot():
        async with aiohttp.ClientSession() as session:
            async with session.post(
                "http://localhost:8080/mcp/tools/screenshot",
                json={"url": "https://github.com", "full_page": False}
            ) as resp:
                result = await resp.json()
                # result contains base64 encoded screenshot

    asyncio.run(take_screenshot())
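Since the response carries the screenshot as base64 text, a small helper can turn it back into an image file. This is a sketch only: `save_screenshot` is a hypothetical name, and the README does not say which key of the JSON response holds the data, so the `"screenshot"` key in the usage comment is an assumption.

```python
import base64
from pathlib import Path


def save_screenshot(b64_data: str, path: str) -> int:
    """Decode a base64 string, write the bytes to `path`, and return the byte count."""
    raw = base64.b64decode(b64_data)
    Path(path).write_bytes(raw)
    return len(raw)


# Possible usage inside take_screenshot(), assuming the data sits under a
# "screenshot" key (not confirmed by the README):
#     save_screenshot(result["screenshot"], "page.png")
```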
- Build and deploy:
      gcloud builds submit --config cloudbuild.yaml
- Environment variables:
  - ALLOWED_DOMAINS: Comma-separated list of allowed domains
  - LOG_LEVEL: Logging level (default: INFO)
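The two environment variables could be read along these lines. This is a sketch under stated assumptions: `load_settings` is a hypothetical name, and trimming whitespace, lowercasing domains, and falling back to INFO for unrecognized level names are choices made for the example rather than documented server behavior.

```python
import logging
import os

# Standard logging level names mapped to their numeric values.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def load_settings(env=os.environ):
    """Return (allowed_domains, log_level) parsed from an environment mapping."""
    raw = env.get("ALLOWED_DOMAINS", "")
    # Split the comma-separated list and drop empty entries.
    domains = [d.strip().lower() for d in raw.split(",") if d.strip()]
    # LOG_LEVEL defaults to INFO, as the README states.
    level = _LEVELS.get(env.get("LOG_LEVEL", "INFO").upper(), logging.INFO)
    return domains, level


domains, level = load_settings({"ALLOWED_DOMAINS": "example.com, github.com"})
print(domains, logging.getLevelName(level))
```

Passing the environment in as a mapping keeps the function easy to exercise without touching the real process environment.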
    browser-use-mcp/
    ├── app.py                   # Main FastAPI application
    ├── mcp_tools.py             # Browser automation tool implementations
    ├── browser_manager.py       # Centralized browser lifecycle management
    ├── security_middleware.py   # Google Cloud logging integration
    ├── chat_interface_v2.html   # Web-based chat interface
    ├── requirements.txt         # Python dependencies
    ├── Dockerfile               # Container configuration
    ├── cloudbuild.yaml          # CI/CD pipeline
    └── agents/                  # ADK agent configurations
Capture a screenshot of a webpage.
{ "url": "https://example.com", "full_page": false }
Navigate to a URL and return page information.
{ "url": "https://example.com" }
Click an element on the page.
{ "url": "https://example.com", "selector": "button#submit", "return_screenshot": false }
Fill a form field with text.
{ "url": "https://example.com", "selector": "input[name='email']", "text": "user@example.com", "return_screenshot": false }
Extract text content from the page.
{ "url": "https://example.com", "selector": "article" }
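The request bodies above can be sent from Python with only the standard library. The sketch below builds the request without sending it. `build_tool_request` is a hypothetical helper, and it assumes every tool lives under `/mcp/tools/<name>`; the curl examples confirm that pattern only for `screenshot`, `navigate`, and `read`, so the `click` and `fill` paths are an assumption.

```python
import json
import urllib.request


def build_tool_request(base_url: str, tool: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for one tool endpoint (path pattern is assumed)."""
    url = f"{base_url.rstrip('/')}/mcp/tools/{tool}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_tool_request(
    "http://localhost:8080",
    "fill",
    {
        "url": "https://example.com",
        "selector": "input[name='email']",
        "text": "user@example.com",
        "return_screenshot": False,
    },
)
print(req.full_url)
# To send it against a running server: urllib.request.urlopen(req)
```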
python test_all_tools.py
    docker build -t browser-use-mcp .
    docker run -p 8080:8080 browser-use-mcp
- CORS errors in browser: The server includes CORS middleware. Make sure you're accessing from allowed origins.
- Playwright browser issues: Ensure Chromium is installed:
      playwright install chromium
- Port already in use: Check for running processes:
      lsof -i :8080
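Where `lsof` is not available (for example on Windows), a rough equivalent of the port check can be done from Python. This is a sketch; `port_in_use` is a hypothetical helper, and it only reports whether something accepts TCP connections on the port, not which process owns it.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return s.connect_ex((host, port)) == 0


print("Port 8080 in use:", port_in_use(8080))
```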
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with FastAPI
- Browser automation powered by Playwright
- MCP protocol implementation using FastMCP
- Deployed on Google Cloud Run