
Overview
The Scrape service provides direct access to raw HTML content from web pages, with optional JavaScript rendering support. This service is perfect for applications that need the complete HTML structure of a webpage, including dynamically generated content.
Try the Scrape service instantly in our interactive playground - no coding required!
Getting Started
Quick Start
Parameters
Parameter | Type | Required | Description |
---|---|---|---|
apiKey | string | Yes | The ScrapeGraph API Key. |
websiteUrl | string | Yes | The URL of the webpage to scrape. |
render_heavy_js | boolean | No | Set to true for heavy JavaScript rendering. Default: false |
Get your API key from the dashboard
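As a rough illustration of how the parameters above fit together, the sketch below builds a request using only the Python standard library. The endpoint URL, the `SGAI-APIKEY` header name, and the `website_url` body field are assumptions not stated on this page, and `build_scrape_request` and `send` are hypothetical helper names; check the API Reference for the actual values.

```python
import json
import urllib.request

# Assumptions (not stated on this page): the endpoint URL, the API-key
# header name, and the JSON field name used for the target URL.
SCRAPE_ENDPOINT = "https://api.scrapegraphai.com/v1/scrape"
API_KEY_HEADER = "SGAI-APIKEY"


def build_scrape_request(api_key: str, website_url: str,
                         render_heavy_js: bool = False) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Scrape service."""
    if not api_key:
        raise ValueError("api_key is required")
    if not website_url.startswith(("http://", "https://")):
        raise ValueError("website_url must be an absolute http(s) URL")
    payload = {
        "website_url": website_url,          # assumed wire name for websiteUrl
        "render_heavy_js": render_heavy_js,  # defaults to False, as in the table
    }
    return urllib.request.Request(
        SCRAPE_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={API_KEY_HEADER: api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.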
Example Response
- request_id: Unique identifier for tracking your request
- status: Current status of the scraping operation
- html: Raw HTML content of the webpage
- error: Error message (if any occurred during scraping)
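The response fields can be checked before the HTML is used. The following is a minimal sketch that treats the response as a plain dictionary; the success value `"completed"` is an assumption (this page does not list the possible `status` values), and `ScrapeError` and `extract_html` are hypothetical names.

```python
# Assumption: a finished request reports this status value.
SUCCESS_STATUS = "completed"


class ScrapeError(Exception):
    """Raised when a scrape response does not contain usable HTML."""


def extract_html(response: dict) -> str:
    """Return the html field, or raise ScrapeError with the request_id."""
    request_id = response.get("request_id", "<unknown>")
    error = response.get("error")
    if error:
        # A non-empty error message always wins over the status field.
        raise ScrapeError(f"request {request_id} failed: {error}")
    status = response.get("status")
    if status != SUCCESS_STATUS:
        raise ScrapeError(f"request {request_id} has status {status!r}")
    return response.get("html") or ""
```

Including the `request_id` in the error message makes a failed request easier to trace later.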
Key Features
Raw HTML Access
Get complete HTML structure including all elements
JavaScript Rendering
Optional support for heavy JavaScript rendering
Fast Processing
Quick extraction for simple HTML content
Reliable Output
Consistent results across different websites
Use Cases
Web Development
- Extract HTML templates
- Analyze page structure
- Test website rendering
- Debug HTML issues
Data Analysis
- Parse HTML content
- Extract specific elements
- Monitor website changes
- Build web scrapers
Content Processing
- Process dynamic content
- Handle JavaScript-heavy sites
- Extract embedded data
- Analyze page performance
Want to learn more about our AI-powered scraping technology? Visit our main website to discover how we’re revolutionizing web data extraction.
JavaScript Rendering
The render_heavy_js parameter controls whether JavaScript should be executed on the target page.
When to Use JavaScript Rendering
- Single Page Applications (SPAs): React, Vue, Angular apps
- Dynamic Content: Content loaded via AJAX/fetch
- Interactive Elements: Dropdowns, modals, infinite scroll
- Client-side Routing: Hash-based or history API routing
When to Skip JavaScript Rendering
- Static HTML Pages: Traditional server-rendered content
- Performance: Faster processing for simple pages
- Cost Optimization: Lower API usage for basic scraping
- Reliability: More predictable results for static content
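One way to apply the guidance above is to try the cheaper static scrape first and re-request with `render_heavy_js` enabled only when the result looks like an empty application shell. The sketch below is a heuristic, not part of the service: the 200-character threshold is arbitrary, and `fetch` is a hypothetical callable `(url, render_heavy_js) -> html` that you supply.

```python
from html.parser import HTMLParser
from typing import Callable


class _PageStats(HTMLParser):
    """Counts <script> tags and visible text characters."""

    def __init__(self) -> None:
        super().__init__()
        self.script_tags = 0
        self.text_chars = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_tags += 1
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.text_chars += len(data.strip())


def looks_like_js_shell(html: str, min_text_chars: int = 200) -> bool:
    """Heuristic: scripts are present but almost no visible text."""
    stats = _PageStats()
    stats.feed(html)
    stats.close()
    return stats.script_tags > 0 and stats.text_chars < min_text_chars


def scrape_with_fallback(url: str, fetch: Callable[[str, bool], str]) -> str:
    """Scrape without JS first; retry with JS rendering for shell pages."""
    html = fetch(url, False)
    if looks_like_js_shell(html):
        html = fetch(url, True)
    return html
```

This keeps static pages on the faster path and spends a second request only on pages that appear to need rendering.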
Advanced Usage
Async Support
For applications requiring asynchronous execution, the Scrape service provides async support.
Concurrent Processing
Process multiple URLs concurrently for better performance.
Integration Options
Official SDKs
- Python SDK - Perfect for automation and data processing
- JavaScript SDK - Ideal for web applications and browser tools
AI Framework Integrations
- LangChain Integration - Use Scrape in your content pipelines
- LlamaIndex Integration - Create searchable knowledge bases
Best Practices
Performance Optimization
- Use render_heavy_js=false for static content
- Process multiple URLs concurrently
- Cache results when possible
- Monitor API usage and costs
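Concurrent processing and caching can be combined in a few lines. This is a minimal sketch using a thread pool from the standard library rather than the SDK's own async client; `scrape_many` is a hypothetical helper, and `fetch` stands in for whatever function performs a single scrape and returns its HTML.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


def scrape_many(urls: Iterable[str],
                fetch: Callable[[str], str],
                max_workers: int = 5,
                cache: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Fetch each distinct URL once, in parallel, reusing cached results."""
    cache = {} if cache is None else cache
    unique = list(dict.fromkeys(urls))  # de-duplicate, keep input order
    pending = [url for url in unique if url not in cache]
    if pending:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # pool.map yields results in input order; the first exception
            # raised by fetch is re-raised here when its result is reached.
            for url, html in zip(pending, pool.map(fetch, pending)):
                cache[url] = html
    return {url: cache[url] for url in unique}
```

Passing the same `cache` dictionary to later calls avoids re-scraping pages already fetched, which also helps keep API usage down.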
Error Handling
- Always check the status field
- Handle network timeouts gracefully
- Implement retry logic for failed requests
- Log errors for debugging
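Retry logic and error logging from the list above might look like the following sketch. It uses exponential backoff, which is one common choice and not something this page prescribes; `with_retries` is a hypothetical helper, and retrying on `OSError` is an assumption that covers `TimeoutError` and `urllib.error.URLError` in the standard library.

```python
import logging
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")
logger = logging.getLogger("scrape")


def with_retries(operation: Callable[[], T],
                 attempts: int = 3,
                 base_delay: float = 1.0,
                 retry_on: Tuple[Type[BaseException], ...] = (OSError,),
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation(), retrying with exponential backoff on failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as exc:
            logger.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the backoff schedule can be tested without waiting.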
Content Processing
- Validate HTML structure before parsing
- Handle different character encodings
- Extract only needed content sections
- Clean up HTML for further processing
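To illustrate extracting only the needed sections, the sketch below pulls the page title and link targets out of returned HTML with the standard-library parser. `SectionExtractor` and `summarize` are hypothetical names, and the choice of title and links is only an example of "needed content"; a dedicated HTML library may suit larger jobs better.

```python
from html.parser import HTMLParser
from typing import Dict, List, Union


class SectionExtractor(HTMLParser):
    """Collects the <title> text and the href of every <a> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.links: List[str] = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors without a target
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def summarize(html: str) -> Dict[str, Union[str, List[str]]]:
    """Return the stripped title and the list of link targets."""
    parser = SectionExtractor()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), "links": parser.links}
```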
Example Projects
Check out our cookbook for real-world examples:
- Web scraping automation tools
- Content monitoring systems
- HTML analysis applications
- Dynamic content extractors
API Reference
For detailed API documentation, see the API Reference.
Support & Resources
Documentation
Comprehensive guides and tutorials
API Reference
Detailed API documentation
Community
Join our Discord community
GitHub
Check out our open-source projects
Main Website
Visit our official website
Ready to Start?
Sign up now and get your API key to begin scraping web content!