Web scraping API for LLM applications

URL in, LLM-ready content out. Markdown by default, structured data on demand.

$ curl -X POST https://api.scrape2llm.com/jobs \
  -H "X-API-Key: s2l_..." \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

{
  "job_id": "abc-123",
  "status": "queued"
}

What you get

Markdown by default

Clean, LLM-ready content out of the box. No HTML noise, no JavaScript, just text and structure.

Rich parse mode

Opt into ParsedDocument: raw HTML, link graph, and structured blocks for advanced pipelines.

Sync or webhook

Poll for status or pass a callback URL. Polling suits interactive queries; callbacks suit batch ingestion.
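
A minimal polling sketch in Python, under stated assumptions: the page only shows the "queued" status, so the terminal states "done" and "failed" are hypothetical placeholders, and fetch_job stands in for whatever function you use to call GET /jobs/{job_id}.

```python
import time
from typing import Callable, Dict

# Assumption: these terminal status names are illustrative only.
# The page itself only shows "queued".
TERMINAL_STATUSES = {"done", "failed"}


def wait_for_job(
    fetch_job: Callable[[str], Dict],
    job_id: str,
    interval: float = 1.0,
    max_interval: float = 10.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll fetch_job(job_id) with exponential backoff until a terminal status."""
    waited = 0.0
    while True:
        job = fetch_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} not finished after {waited:.0f}s")
        sleep(interval)
        waited += interval
        # Double the delay after each attempt, capped at max_interval.
        interval = min(interval * 2, max_interval)
```

Passing fetch_job and sleep as parameters keeps the loop independent of any HTTP library and easy to exercise without network access.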

Selector targeting

Pinpoint a specific section with CSS selectors when full-page content is too much.
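
A request body for a targeted scrape might be built as in the sketch below. The "url" and "parse" fields appear in the examples on this page; the "selector" field name is an assumption for illustration, so check the API reference for the actual parameter.

```python
import json
from typing import Optional


def build_job_payload(url: str, selector: Optional[str] = None, parse: bool = False) -> str:
    """Serialize a job request body as JSON."""
    payload = {"url": url}
    if parse:
        payload["parse"] = True
    if selector:
        # Hypothetical field name: the page does not document it.
        payload["selector"] = selector
    return json.dumps(payload)


# Example: restrict the scrape to the page's main article element.
body = build_job_payload("https://example.com", selector="main article")
```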

R2-backed HTML

Raw HTML is offloaded to Cloudflare R2, so job records stay small and you fetch raw bytes on demand.

Multiple keys per account

Rotate keys safely, label them per environment, and revoke them individually from the dashboard.

See it in action

Same API, three flavors. Pick your stack.

# Submit a job
curl -X POST https://api.scrape2llm.com/jobs \
  -H "X-API-Key: s2l_..." \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "parse": true}'

# Response
{"job_id":"abc-123","status":"queued"}

# Poll until ready
curl https://api.scrape2llm.com/jobs/abc-123 \
  -H "X-API-Key: s2l_..."
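
The same two calls could look roughly like this in Python, using only the standard library. This is a sketch rather than an official client: the endpoint, headers, and body fields are copied from the curl commands above, while the helper names are made up for illustration.

```python
import json
import urllib.request

API_BASE = "https://api.scrape2llm.com"  # taken from the curl examples


def build_submit_request(api_key: str, url: str, parse: bool = False) -> urllib.request.Request:
    """Build the POST /jobs request without sending it."""
    body = {"url": url}
    if parse:
        body["parse"] = True
    return urllib.request.Request(
        f"{API_BASE}/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def build_status_request(api_key: str, job_id: str) -> urllib.request.Request:
    """Build the GET /jobs/{job_id} request without sending it."""
    return urllib.request.Request(
        f"{API_BASE}/jobs/{job_id}",
        headers={"X-API-Key": api_key},
        method="GET",
    )
```

To send either request, pass it to urllib.request.urlopen and decode the response body with json.load.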

scrape2llm is part of the 2LLM Suite, a set of focused APIs that turn messy inputs into LLM-ready data. Also try files2llm, html2media, html2reel, and stream2llm.

Start scraping in under a minute.

Get your API key →