Start Crawl

POST /v1/crawl

Start a new crawl job using SmartCrawler. Choose between AI-powered extraction and cost-effective markdown conversion.

Request Body

Content-Type: application/json

Schema

{
  "url": "string",
  "prompt": "string",
  "extraction_mode": "boolean",
  "cache_website": "boolean",
  "depth": "integer",
  "max_pages": "integer",
  "same_domain_only": "boolean",
  "batch_size": "integer",
  "schema": { /* JSON Schema object */ }
}

Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | Yes | - | The starting URL for the crawl |
| prompt | string | No* | - | Instructions for data extraction (*required when extraction_mode=true) |
| extraction_mode | boolean | No | true | When false, enables markdown conversion mode (no AI/LLM processing, 2 credits per page) |
| cache_website | boolean | No | false | Whether to cache the website content |
| depth | integer | No | 1 | Maximum crawl depth |
| max_pages | integer | No | 10 | Maximum number of pages to crawl |
| same_domain_only | boolean | No | true | Whether to crawl only the same domain |
| batch_size | integer | No | 1 | Number of pages to process in each batch |
| schema | object | No | - | JSON Schema object for structured output |
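
The one conditional rule in the table (prompt is required only when extraction_mode is true) is easy to get wrong on the client side. The following is a minimal sketch of a request-body builder that enforces it before anything is sent. The function name `build_crawl_payload` and its error messages are hypothetical and not part of the API; only the parameter names come from the table above.

```python
from typing import Any, Optional

# Optional parameters listed in the table above (besides prompt and schema).
ALLOWED_OPTIONS = {
    "cache_website",
    "depth",
    "max_pages",
    "same_domain_only",
    "batch_size",
}


def build_crawl_payload(
    url: str,
    prompt: Optional[str] = None,
    extraction_mode: bool = True,
    schema: Optional[dict] = None,
    **options: Any,
) -> dict:
    """Assemble a JSON-serializable body for POST /v1/crawl (hypothetical helper)."""
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # Documented rule: prompt is required when extraction_mode is true.
    if extraction_mode and not prompt:
        raise ValueError("prompt is required when extraction_mode is true")

    payload: dict = {"url": url, "extraction_mode": extraction_mode}
    if prompt is not None:
        payload["prompt"] = prompt
    if schema is not None:
        payload["schema"] = schema
    # Omitted options fall back to the server-side defaults from the table.
    payload.update(options)
    return payload
```

Calling `build_crawl_payload("https://scrapegraphai.com/", extraction_mode=False, depth=2)` returns a body without a prompt, while omitting both the prompt and `extraction_mode=False` raises `ValueError` locally instead of waiting for a 422 from the server.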

Example

{
  "url": "https://scrapegraphai.com/",
  "prompt": "What does the company do? and I need text content from their privacy and terms",
  "cache_website": true,
  "depth": 2,
  "max_pages": 2,
  "same_domain_only": true,
  "batch_size": 1,
  "schema": {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "ScrapeGraphAI Website Content",
    "type": "object",
    "properties": {
      "company": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "description": { "type": "string" },
          "features": {
            "type": "array",
            "items": { "type": "string" }
          },
          "contact_email": { "type": "string", "format": "email" },
          "social_links": {
            "type": "object",
            "properties": {
              "github": { "type": "string", "format": "uri" },
              "linkedin": { "type": "string", "format": "uri" },
              "twitter": { "type": "string", "format": "uri" }
            },
            "additionalProperties": false
          }
        },
        "required": ["name", "description"]
      },
      "services": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "service_name": { "type": "string" },
            "description": { "type": "string" },
            "features": {
              "type": "array",
              "items": { "type": "string" }
            }
          },
          "required": ["service_name", "description"]
        }
      },
      "legal": {
        "type": "object",
        "properties": {
          "privacy_policy": { "type": "string" },
          "terms_of_service": { "type": "string" }
        },
        "required": ["privacy_policy", "terms_of_service"]
      }
    },
    "required": ["company", "services", "legal"]
  }
}

Markdown Conversion Example (No AI/LLM)

For cost-effective HTML to markdown conversion without AI processing:
{
  "url": "https://scrapegraphai.com/",
  "extraction_mode": false,
  "depth": 2,
  "max_pages": 5,
  "same_domain_only": true
}
When extraction_mode: false, the prompt parameter is not required. This mode converts HTML to clean markdown with metadata extraction at only 2 credits per page (80% savings compared to AI mode).
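
To make the pricing note concrete, here is a rough sketch of an upper-bound credit estimate for a crawl. The markdown rate (2 credits per page) and the 80% savings figure come from the text above; the AI-mode rate of 10 credits per page is not stated on this page and is only inferred from those two numbers, so treat it as an assumption. The function name `estimate_max_credits` is hypothetical.

```python
MARKDOWN_CREDITS_PER_PAGE = 2   # stated in the text above
MARKDOWN_SAVINGS = 0.80         # stated in the text above

# Assumption: inferred AI-mode rate, 2 / (1 - 0.80) = 10 credits per page.
AI_CREDITS_PER_PAGE = round(MARKDOWN_CREDITS_PER_PAGE / (1 - MARKDOWN_SAVINGS))


def estimate_max_credits(max_pages: int, extraction_mode: bool = True) -> int:
    """Upper bound on credits: the crawl may visit fewer than max_pages pages."""
    if max_pages < 0:
        raise ValueError("max_pages must be non-negative")
    rate = AI_CREDITS_PER_PAGE if extraction_mode else MARKDOWN_CREDITS_PER_PAGE
    return max_pages * rate


# The markdown example above (max_pages = 5) costs at most 5 * 2 = 10 credits.
print(estimate_max_credits(5, extraction_mode=False))
```

Because max_pages is a ceiling rather than a guaranteed page count, the result is a worst-case figure, not a prediction of the actual charge.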

Response

  • 200 OK: Crawl started successfully. Returns { "task_id": "<task_id>" }. Use this task_id to retrieve the crawl result from the Get Crawl Result endpoint.
  • 422 Unprocessable Entity: Validation error.
See the Get Crawl Result endpoint for the full response structure.
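
The two documented outcomes can be handled as in the sketch below, which uses only the Python standard library. Several details are assumptions because this page does not specify them: the base URL `https://api.example.com` is a placeholder, the `X-API-Key` header name is hypothetical, and the shape of the 422 error body is unknown, so it is passed through as raw text.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder: host is not given on this page
API_KEY_HEADER = "X-API-Key"          # hypothetical: auth scheme is not given here


class CrawlValidationError(Exception):
    """Raised when the server answers 422 Unprocessable Entity."""


def parse_start_response(status: int, body: str) -> str:
    """Return the task_id for a 200 response, or raise for anything else."""
    if status == 200:
        return json.loads(body)["task_id"]
    if status == 422:
        # Error body format is undocumented here, so keep it as raw text.
        raise CrawlValidationError(body)
    raise RuntimeError(f"unexpected status {status}: {body}")


def start_crawl(payload: dict, api_key: str) -> str:
    """POST the payload to /v1/crawl and return the task_id."""
    request = urllib.request.Request(
        f"{BASE_URL}/v1/crawl",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return parse_start_response(response.status, response.read().decode("utf-8"))
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for 4xx/5xx; route it through the same parser.
        return parse_start_response(error.code, error.read().decode("utf-8"))
```

Keeping `parse_start_response` separate from the network call means the status handling can be exercised without a live server; the returned task_id is what the Get Crawl Result endpoint expects.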