Content Ingestion

After verifying your domain, you can ingest content from your website. The system supports three ingestion modes, each suited to different scenarios.

Sitemap Mode

The system reads your sitemap.xml and indexes all listed URLs. Best for sites with a comprehensive sitemap.

mode: "sitemap"
source_url: "https://example.com/sitemap.xml" (optional; auto-detected if omitted)
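
To make the idea of "reading a sitemap" concrete, the following Python sketch extracts the listed URLs from a sitemap document. It is an illustration only, not the service's implementation: the function name sitemap_urls is hypothetical, and the sketch assumes a flat urlset sitemap (it does not follow sitemap index files).

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> list[str]:
    """Return the <loc> values of a flat <urlset> sitemap, in document order.

    Hypothetical helper for illustration; sitemap index files are not handled.
    """
    root = ET.fromstring(xml_text)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/page-1</loc></url>
</urlset>"""

print(sitemap_urls(sample))
```

Running a check like this against your own sitemap before ingestion is one way to preview which pages would count toward your quota.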

URL List Mode

Provide an explicit list of URLs to index. Best when you want to index specific pages only.

mode: "url_list"
urls: ["https://example.com/page-1", "https://example.com/page-2"]
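
As a sketch of how a URL list might be prepared before submission, the Python snippet below removes duplicates and rejects entries that are not absolute http(s) URLs. The mode and urls fields come from the configuration above; the helper name build_url_list_request and the validation rules are assumptions for illustration, not documented service behavior.

```python
from urllib.parse import urlsplit


def build_url_list_request(urls: list[str]) -> dict:
    """Build a url_list payload, keeping first occurrences and input order.

    Hypothetical helper: the validation here is an assumption, not a
    documented requirement of the ingestion service.
    """
    seen = set()
    cleaned = []
    for url in urls:
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an absolute http(s) URL: {url!r}")
        if url not in seen:
            seen.add(url)
            cleaned.append(url)
    return {"mode": "url_list", "urls": cleaned}


request = build_url_list_request([
    "https://example.com/page-1",
    "https://example.com/page-2",
    "https://example.com/page-1",  # duplicate, dropped
])
print(request)
```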

Crawl Mode

The system crawls your site starting from the base URL, following internal links up to a configured depth. Best for sites without a sitemap.

mode: "crawl"
max_depth: 3 (configurable; the default depends on your plan)
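
The effect of max_depth can be illustrated with a depth-limited breadth-first traversal. The Python sketch below walks an in-memory link graph instead of fetching pages, so it runs without network access. It assumes the base URL counts as depth 0 and that each followed link adds one level; how the service actually counts depth is not specified here, and the function name crawl_order is hypothetical.

```python
from collections import deque


def crawl_order(links: dict[str, list[str]], base_url: str, max_depth: int) -> list[str]:
    """Breadth-first traversal from base_url, following at most max_depth link hops.

    `links` maps each page to the internal links found on it (illustrative data).
    """
    visited = {base_url}
    order = [base_url]
    queue = deque([(base_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # page is indexed, but its links are not followed
        for target in links.get(url, []):
            if target not in visited:
                visited.add(target)
                order.append(target)
                queue.append((target, depth + 1))
    return order


site = {
    "/": ["/docs", "/blog"],
    "/docs": ["/docs/setup"],
    "/docs/setup": ["/docs/setup/advanced"],
}
print(crawl_order(site, "/", 2))
```

With max_depth set to 2 in this example, /docs/setup/advanced is three hops from the base URL and is therefore never reached.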

Ingestion is subject to your plan's page and ingestion quotas. See Quotas & Caching for details.

Ingestion Lifecycle

Queued: The ingestion request has been accepted and is waiting for a queue worker.
Processing: Content is being fetched from your site and parsed.
Completed: All pages have been indexed and are queryable via MCP.
Failed: The ingestion did not complete. Check the error details and retry if needed.
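
The lifecycle above can be modeled as a small state set with two terminal states. The Python sketch below polls until an ingestion is Completed or Failed. The state names come from this page; everything else is an assumption for illustration: the lowercase string values, the wait_for_ingestion helper, and the fetch_status callable, which stands in for whatever status lookup your integration uses (this page does not describe one).

```python
import time
from enum import Enum
from typing import Callable


class IngestionStatus(Enum):
    # String values are assumed; only the state names appear in the docs.
    QUEUED = "queued"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


TERMINAL = {IngestionStatus.COMPLETED, IngestionStatus.FAILED}


def wait_for_ingestion(
    fetch_status: Callable[[], IngestionStatus],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
) -> IngestionStatus:
    """Poll fetch_status until a terminal state is seen or max_polls is used up."""
    for attempt in range(max_polls):
        status = fetch_status()
        if status in TERMINAL:
            return status
        if attempt < max_polls - 1:
            time.sleep(poll_seconds)  # wait before the next status check
    raise TimeoutError("ingestion did not reach a terminal state in time")


# Usage with a stubbed status source instead of a real status lookup.
states = iter([
    IngestionStatus.QUEUED,
    IngestionStatus.PROCESSING,
    IngestionStatus.COMPLETED,
])
final = wait_for_ingestion(lambda: next(states), poll_seconds=0)
print(final.name)
```

Bounding the number of polls keeps a stuck job from blocking the caller indefinitely; on Failed, the caller can inspect the error details and decide whether to retry.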

Once ingestion completes, your site's content is available via MCP Endpoints.
