Start Crawl

POST /v1/crawl

Start a new crawl job using SmartCrawler. Choose between AI-powered extraction or cost-effective markdown conversion.

Request Body

Content-Type: application/json

Schema

{
  "url": "string",
  "prompt": "string",
  "extraction_mode": "boolean",
  "cache_website": "boolean",
  "depth": "integer",
  "max_pages": "integer",
  "same_domain_only": "boolean",
  "batch_size": "integer",
  "schema": { /* JSON Schema object */ }
}

Parameters

Parameter	Type	Required	Default	Description
url	string	Yes	-	The starting URL for the crawl
prompt	string	No*	-	Instructions for data extraction (*required when extraction_mode=true)
extraction_mode	boolean	No	true	When `false`, enables markdown conversion mode (NO AI/LLM processing, 2 credits per page)
cache_website	boolean	No	false	Whether to cache the website content
depth	integer	No	1	Maximum crawl depth
max_pages	integer	No	10	Maximum number of pages to crawl
same_domain_only	boolean	No	true	Whether to crawl only the same domain
batch_size	integer	No	1	Number of pages to process in each batch
schema	object	No	-	JSON Schema object for structured output
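The defaults and the conditional requirement on prompt in the table above can be sketched as a small payload builder. This is a minimal illustration, not part of any official SDK: the function name build_crawl_payload is hypothetical, and only the parameter names, defaults, and the "prompt is required when extraction_mode=true" rule are taken from the table.

```python
from typing import Any, Dict, Optional


def build_crawl_payload(
    url: str,
    prompt: Optional[str] = None,
    extraction_mode: bool = True,
    cache_website: bool = False,
    depth: int = 1,
    max_pages: int = 10,
    same_domain_only: bool = True,
    batch_size: int = 1,
    schema: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a request body for POST /v1/crawl (hypothetical helper).

    Defaults mirror the Parameters table; nothing here contacts the API.
    """
    if not url:
        raise ValueError("url is required")
    # The table marks prompt as required only when extraction_mode is true.
    if extraction_mode and not prompt:
        raise ValueError("prompt is required when extraction_mode is true")

    payload: Dict[str, Any] = {
        "url": url,
        "extraction_mode": extraction_mode,
        "cache_website": cache_website,
        "depth": depth,
        "max_pages": max_pages,
        "same_domain_only": same_domain_only,
        "batch_size": batch_size,
    }
    # Optional fields are omitted entirely when not supplied.
    if prompt is not None:
        payload["prompt"] = prompt
    if schema is not None:
        payload["schema"] = schema
    return payload
```

Validating on the client side like this is optional; the API itself reports invalid bodies with a 422 response.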

Example

{
  "url": "https://scrapegraphai.com/",
  "prompt": "What does the company do? and I need text content from their privacy and terms",
  "cache_website": true,
  "depth": 2,
  "max_pages": 2,
  "same_domain_only": true,
  "batch_size": 1,
  "schema": {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "ScrapeGraphAI Website Content",
    "type": "object",
    "properties": {
      "company": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "description": { "type": "string" },
          "features": {
            "type": "array",
            "items": { "type": "string" }
          },
          "contact_email": { "type": "string", "format": "email" },
          "social_links": {
            "type": "object",
            "properties": {
              "github": { "type": "string", "format": "uri" },
              "linkedin": { "type": "string", "format": "uri" },
              "twitter": { "type": "string", "format": "uri" }
            },
            "additionalProperties": false
          }
        },
        "required": ["name", "description"]
      },
      "services": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "service_name": { "type": "string" },
            "description": { "type": "string" },
            "features": {
              "type": "array",
              "items": { "type": "string" }
            }
          },
          "required": ["service_name", "description"]
        }
      },
      "legal": {
        "type": "object",
        "properties": {
          "privacy_policy": { "type": "string" },
          "terms_of_service": { "type": "string" }
        },
        "required": ["privacy_policy", "terms_of_service"]
      }
    },
    "required": ["company", "services", "legal"]
  }
}

Markdown Conversion Example (No AI/LLM)

For cost-effective HTML to markdown conversion without AI processing:

{
  "url": "https://scrapegraphai.com/",
  "extraction_mode": false,
  "depth": 2,
  "max_pages": 5,
  "same_domain_only": true
}

When extraction_mode is false, the prompt parameter is not required. This mode converts HTML to clean markdown and extracts page metadata at 2 credits per page (80% savings compared to AI mode).
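The pricing statement above can be turned into a rough upper bound on what a crawl costs. Only the 2 credits per page and the 80% figure come from the text; the AI-mode rate of 10 credits per page is an inference from those two numbers, not a documented price, and the function name max_crawl_credits is hypothetical.

```python
MARKDOWN_CREDITS_PER_PAGE = 2  # stated in the documentation
SAVINGS_PERCENT = 80           # stated in the documentation

# Inferred, not stated: if 2 credits is 80% cheaper than AI mode,
# AI mode costs 2 * 100 / (100 - 80) credits per page.
AI_CREDITS_PER_PAGE = MARKDOWN_CREDITS_PER_PAGE * 100 // (100 - SAVINGS_PERCENT)


def max_crawl_credits(max_pages: int, extraction_mode: bool = True) -> int:
    """Upper bound on credits for one crawl, assuming flat per-page pricing.

    The crawl may visit fewer pages than max_pages, so the real cost
    can be lower than this estimate.
    """
    if max_pages < 0:
        raise ValueError("max_pages must be non-negative")
    per_page = AI_CREDITS_PER_PAGE if extraction_mode else MARKDOWN_CREDITS_PER_PAGE
    return max_pages * per_page
```

Under these assumptions, the markdown example above (max_pages of 5) is bounded at 10 credits, against 50 for the same crawl in AI mode.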

Response

200 OK: Crawl started successfully. Returns { "task_id": "<task_id>" }. Use this task_id to retrieve the crawl result from the Get Crawl Result endpoint.
422 Unprocessable Entity: Validation error.

See the Get Crawl Result endpoint for the full response structure.
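Handling the two documented responses might look like the sketch below, which uses only the Python standard library. The base URL and any authentication headers are not given in this section, so both are passed in by the caller as assumptions; the names start_crawl, parse_start_crawl_response, and CrawlValidationError are hypothetical.

```python
import json
import urllib.error
import urllib.request
from typing import Any, Dict


class CrawlValidationError(Exception):
    """Raised when the API answers 422 Unprocessable Entity."""


def parse_start_crawl_response(status: int, body: bytes) -> str:
    """Return the task_id from a 200 response; raise for anything else."""
    if status == 200:
        return json.loads(body)["task_id"]
    if status == 422:
        raise CrawlValidationError(body.decode("utf-8", errors="replace"))
    raise RuntimeError(f"Unexpected status {status}")


def start_crawl(base_url: str, payload: Dict[str, Any], headers: Dict[str, str]) -> str:
    """POST the payload to /v1/crawl and return the task_id.

    base_url and headers (for example an API key header) are assumptions
    supplied by the caller; this section does not specify them.
    """
    request = urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/crawl",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", **headers},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return parse_start_crawl_response(response.status, response.read())
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx; route the body through the same parser.
        return parse_start_crawl_response(err.code, err.read())
```

The returned task_id is the value to pass to the Get Crawl Result endpoint.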
