API Reference

Base URL: https://api.scrape2llm.com

Authentication

Job endpoints use the X-API-Key header with a key from your dashboard. Account endpoints (/me, /me/keys) use a Bearer JWT from Kinde; they are intended for the dashboard, not for API clients.
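A minimal sketch of the two header shapes, for illustration only. The helper names are hypothetical, and sending the JWT as "Authorization: Bearer <token>" is an assumption based on the usual Bearer convention; the reference only says "Bearer JWT".

```python
def job_headers(api_key):
    """Headers for job endpoints (/jobs, /jobs/{id}, /jobs/{id}/html)."""
    return {"X-API-Key": api_key}


def account_headers(jwt):
    """Headers for account endpoints (/me, /me/keys).

    Assumption: the JWT travels in the standard Authorization header.
    """
    return {"Authorization": f"Bearer {jwt}"}
```

API clients would normally need only job_headers; account_headers is shown for completeness.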

POST /jobs

Submit a URL for scraping. Returns a job ID immediately.

Body

  • url (string, required) — URL to scrape.
  • parse (bool, default false) — return ParsedDocument with raw HTML + link graph.
  • selector (string, optional) — CSS selector to pinpoint content.
  • callback_url (string, optional) — webhook target for async delivery.

Response (202)

{"job_id":"abc-123","status":"queued","url":"...","parse":false,"selector":null,"callback_url":null}
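As a rough illustration, a job could be submitted with the Python standard library as sketched below. The function names are hypothetical, and the JSON Content-Type header is an assumption inferred from the JSON bodies shown in this reference.

```python
import json
import urllib.request

BASE_URL = "https://api.scrape2llm.com"


def build_job_request(api_key, url, parse=False, selector=None, callback_url=None):
    """Build (but do not send) a POST /jobs request."""
    body = {"url": url, "parse": parse}
    # Optional fields are only included when the caller sets them.
    if selector is not None:
        body["selector"] = selector
    if callback_url is not None:
        body["callback_url"] = callback_url
    return urllib.request.Request(
        BASE_URL + "/jobs",
        data=json.dumps(body).encode("utf-8"),
        # Assumption: the API expects a JSON content type.
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def submit_job(api_key, url, **options):
    """Send the request and return the job ID from the 202 response."""
    with urllib.request.urlopen(build_job_request(api_key, url, **options)) as resp:
        return json.load(resp)["job_id"]
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.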

GET /jobs/{id}

Returns the status of a job and, once it has been fetched, its result.

# queued / fetching
{"status":"queued"}
{"status":"fetching"}

# fetched (flat, parse=false)
{"status":"fetched","result":{"url":"...","title":"...","content":"# ...","tier":"httpx"}}

# fetched (parse=true) — content + file_id for raw HTML
{"status":"fetched","result":{"url":"...","content":"...","file_id":"abc-123-html"}}

# failed
{"status":"failed","error_message":"..."}
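Because a job moves through queued and fetching before it reaches fetched or failed, a client typically polls until it sees one of the last two. The sketch below shows one way to do that. The function names, the 2-second interval, and the attempt cap are illustrative choices, not values prescribed by the API.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.scrape2llm.com"
TERMINAL_STATUSES = {"fetched", "failed"}


def get_job(api_key, job_id):
    """Fetch the current status document for one job."""
    req = urllib.request.Request(
        f"{BASE_URL}/jobs/{job_id}", headers={"X-API-Key": api_key}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_job(fetch, job_id, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll fetch(job_id) until the job is fetched or failed.

    Returns the final status document; the caller checks "status" and
    reads either "result" or "error_message".
    """
    for _ in range(max_attempts):
        job = fetch(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        sleep(interval)
    raise TimeoutError(f"job {job_id} still pending after {max_attempts} polls")
```

A call might look like wait_for_job(lambda jid: get_job("YOUR_API_KEY", jid), "abc-123"). If a callback_url was supplied at submission, polling may be unnecessary.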

GET /jobs/{id}/html

For parse=true jobs only. Streams the raw HTML from R2. Returns 404 for non-parse jobs or jobs without stored HTML.
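Since the HTML is streamed, a client can copy it to disk in chunks rather than holding it in memory. This is a hedged sketch using only the standard library. The function names are hypothetical, and treating a 404 as "no HTML available" simply mirrors the behaviour described above.

```python
import shutil
import urllib.error
import urllib.request

BASE_URL = "https://api.scrape2llm.com"


def html_request(api_key, job_id):
    """Build the GET /jobs/{id}/html request."""
    return urllib.request.Request(
        f"{BASE_URL}/jobs/{job_id}/html", headers={"X-API-Key": api_key}
    )


def download_html(api_key, job_id, dest_path):
    """Stream the raw HTML to dest_path.

    Returns True on success and False when the API answers 404
    (non-parse job, or no stored HTML). Other HTTP errors are re-raised.
    """
    try:
        with urllib.request.urlopen(html_request(api_key, job_id)) as resp:
            with open(dest_path, "wb") as out:
                shutil.copyfileobj(resp, out)  # copies in chunks
        return True
    except urllib.error.HTTPError as exc:
        if exc.code == 404:
            return False
        raise
```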

GET /me

(Bearer JWT) Returns the current user and a list of their active (non-revoked) API keys.

POST /me/keys

(Bearer JWT) Create a new API key.

# Request
{"label":"Production"}

# Response (201) — raw key shown ONCE
{"id":"...","label":"Production","key_prefix":"s2l_a1b2c3d4","key":"s2l_a1b2c3d4...","created_at":"..."}
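Because the raw key appears only in this response, code that handles it may want to separate the secret from the metadata straight away, so that only the metadata is logged or cached. A small sketch of that idea follows; split_key_response is a hypothetical helper, not part of the API.

```python
def split_key_response(payload):
    """Split a POST /me/keys response into (secret, metadata).

    The secret is the one-time "key" value. The metadata is everything
    else (id, label, key_prefix, created_at) and contains no secret.
    """
    metadata = {name: value for name, value in payload.items() if name != "key"}
    return payload["key"], metadata
```

The secret would then go to a secrets store, while the metadata can be displayed freely.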

DELETE /me/keys/{id}

(Bearer JWT) Soft-delete a key. Subsequent requests with that key return 401.

Errors

  • 401 — missing/invalid/revoked key, or bad JWT.
  • 404 — unknown job ID, or HTML requested for a non-parse job.
  • 422 — malformed request body.
  • 5xx — upstream failure. Job will be marked failed with an error message.
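One way a client might branch on these codes is sketched below. The category names are illustrative labels chosen for this example, not values returned by the API.

```python
def classify_status(status_code):
    """Map an HTTP status code to a coarse category from the list above."""
    if 200 <= status_code < 300:
        return "ok"
    if status_code == 401:
        return "auth"          # missing, invalid, or revoked key, or bad JWT
    if status_code == 404:
        return "not_found"     # unknown job ID, or HTML for a non-parse job
    if status_code == 422:
        return "bad_request"   # malformed request body
    if 500 <= status_code < 600:
        return "upstream"      # upstream failure; the job is marked failed
    return "unexpected"        # anything this reference does not list
```

For example, an "auth" result suggests checking the key in the dashboard, while "upstream" suggests reading error_message from GET /jobs/{id}.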