Matt-MFG/browser-use-mcp


Browser-Use MCP Server

A Model Context Protocol (MCP) server for browser automation using Playwright, deployed on Google Cloud Run. This project provides browser automation capabilities through both REST API and MCP protocol endpoints.

Features

  • 🌐 Browser Automation Tools:

    • Navigate to URLs
    • Take screenshots
    • Click elements
    • Fill form fields
    • Read page content
  • 🔧 Dual Protocol Support:

    • REST API endpoints for direct HTTP access
    • MCP protocol for AI agent integration
  • ☁️ Cloud-Native Design:

    • Deployed on Google Cloud Run
    • Docker containerized
    • Auto-scaling and serverless
  • 🔒 Security Features:

    • Domain allowlist
    • Google Cloud identity token authentication
    • Audit logging

Quick Start

Local Development

  1. Clone the repository:

    git clone https://github.com/yourusername/browser-use-mcp.git
    cd browser-use-mcp
  2. Set up Python environment:

    python3 -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    pip install -r requirements.txt
    playwright install chromium
  3. Run the MCP server:

    uvicorn app:app --host 0.0.0.0 --port 8080
  4. Test with the chat interface:

    # In another terminal
    python3 -m http.server 8090
    # Open http://localhost:8090/chat_interface_v2.html

Using the MCP Server

REST API Examples

    # Take a screenshot
    curl -X POST http://localhost:8080/mcp/tools/screenshot \
      -H "Content-Type: application/json" \
      -d '{"url": "https://example.com", "full_page": false}'

    # Navigate to a URL
    curl -X POST http://localhost:8080/mcp/tools/navigate \
      -H "Content-Type: application/json" \
      -d '{"url": "https://github.com"}'

    # Read page content
    curl -X POST http://localhost:8080/mcp/tools/read \
      -H "Content-Type: application/json" \
      -d '{"url": "https://example.com", "selector": "h1"}'

Python Client Example

    import asyncio
    import aiohttp

    async def take_screenshot():
        async with aiohttp.ClientSession() as session:
            async with session.post(
                "http://localhost:8080/mcp/tools/screenshot",
                json={"url": "https://github.com", "full_page": False},
            ) as resp:
                result = await resp.json()
                # result contains base64 encoded screenshot

    asyncio.run(take_screenshot())
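
The client above notes that the result contains a base64-encoded screenshot, but the README does not show the response shape. The following is a minimal sketch for writing that image to disk. It assumes the payload sits under a hypothetical `screenshot` key; `save_screenshot` and its `key` parameter are illustrative names, so check the actual response before relying on them.

```python
import base64
from pathlib import Path


def save_screenshot(result: dict, path: str, key: str = "screenshot") -> int:
    """Decode a base64 image from a tool response and write it to `path`.

    `key` is an assumption: the README does not name the response field.
    Returns the number of bytes written.
    """
    encoded = result[key]
    # Tolerate a data-URL prefix such as "data:image/png;base64,...".
    if encoded.startswith("data:") and "," in encoded:
        encoded = encoded.split(",", 1)[1]
    data = base64.b64decode(encoded)
    Path(path).write_bytes(data)
    return len(data)
```

Inside `take_screenshot()`, this could be called as `save_screenshot(result, "page.png")` once `result` has been read.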

Deployment

Google Cloud Run

  1. Build and deploy:

    gcloud builds submit --config cloudbuild.yaml
  2. Environment variables:
    • ALLOWED_DOMAINS: Comma-separated list of allowed domains
    • LOG_LEVEL: Logging level (default: INFO)
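
To illustrate how a comma-separated `ALLOWED_DOMAINS` value might be applied, here is a small sketch of an allowlist check. It is not taken from the server's code: the function names are hypothetical, and treating subdomains of a listed domain as allowed is an assumption, since the README does not document the matching rules.

```python
import os
from urllib.parse import urlparse


def parse_allowed_domains(raw: str) -> set[str]:
    """Split a comma-separated domain list, dropping blanks and case."""
    return {d.strip().lower() for d in raw.split(",") if d.strip()}


def is_allowed(url: str, allowed: set[str]) -> bool:
    """True if the URL's host is an allowed domain or a subdomain of one.

    Subdomain matching is an assumption, not documented server behavior.
    """
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed)


# Read the same environment variable the deployment uses.
allowed = parse_allowed_domains(os.environ.get("ALLOWED_DOMAINS", ""))
```

Matching on the parsed hostname rather than on the raw URL string avoids accepting look-alike hosts such as `evilexample.com` when only `example.com` is listed.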

Project Structure

    browser-use-mcp/
    ├── app.py                  # Main FastAPI application
    ├── mcp_tools.py            # Browser automation tool implementations
    ├── browser_manager.py      # Centralized browser lifecycle management
    ├── security_middleware.py  # Google Cloud logging integration
    ├── chat_interface_v2.html  # Web-based chat interface
    ├── requirements.txt        # Python dependencies
    ├── Dockerfile              # Container configuration
    ├── cloudbuild.yaml         # CI/CD pipeline
    └── agents/                 # ADK agent configurations

Available Tools

Screenshot

Capture a screenshot of a webpage.

{ "url": "https://example.com", "full_page": false }

Navigate

Navigate to a URL and return page information.

{ "url": "https://example.com" }

Click

Click an element on the page.

{ "url": "https://example.com", "selector": "button#submit", "return_screenshot": false }

Fill

Fill a form field with text.

{ "url": "https://example.com", "selector": "input[name='email']", "text": "user@example.com", "return_screenshot": false }

Read

Extract text content from the page.

{ "url": "https://example.com", "selector": "article" }
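
The five payloads above share a shape, so a thin helper can build the requests using only the standard library. This is a sketch under assumptions: `build_request` and `TOOL_PARAMS` are hypothetical names, and the `/mcp/tools/click` and `/mcp/tools/fill` paths are inferred from the pattern of the three endpoints shown in the REST examples rather than confirmed by the README.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # local dev address from Quick Start

# Parameter names taken from the JSON examples above.
TOOL_PARAMS = {
    "screenshot": {"url", "full_page"},
    "navigate": {"url"},
    "click": {"url", "selector", "return_screenshot"},
    "fill": {"url", "selector", "text", "return_screenshot"},
    "read": {"url", "selector"},
}


def build_request(tool: str, **params) -> urllib.request.Request:
    """Build (but do not send) a POST request for one tool."""
    if tool not in TOOL_PARAMS:
        raise ValueError(f"unknown tool: {tool}")
    unknown = set(params) - TOOL_PARAMS[tool]
    if unknown:
        raise ValueError(f"unexpected parameters for {tool}: {sorted(unknown)}")
    if "url" not in params:
        raise ValueError("every tool call needs a 'url'")
    return urllib.request.Request(
        f"{BASE_URL}/mcp/tools/{tool}",  # path pattern assumed for click/fill
        data=json.dumps(params).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the server running, `urllib.request.urlopen(build_request("read", url="https://example.com", selector="h1"))` would send the request. Rejecting unknown parameters on the client side catches typos before they reach the server.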

Development

Running Tests

python test_all_tools.py

Building Docker Image

    docker build -t browser-use-mcp .
    docker run -p 8080:8080 browser-use-mcp

Troubleshooting

Common Issues

  1. CORS errors in browser: The server includes CORS middleware. Make sure you're accessing from allowed origins.

  2. Playwright browser issues: Ensure Chromium is installed:

    playwright install chromium
  3. Port already in use: Check for running processes:

    lsof -i :8080
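
`lsof` is not available on every platform (Windows, for example). As a rough cross-platform alternative, the sketch below checks whether anything accepts TCP connections on a port. `port_in_use` is a hypothetical helper, not part of this project, and it only detects listeners reachable on the given host.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("8080 in use:", port_in_use(8080))
```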

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments
