Crawler Stuck or Over Limit

Use this guide when Discovery or Ingestion seems stuck, or when you hit your plan limits.

What’s happening

  • Large sites can take time to process.
  • Some pages are blocked, for example by robots.txt or a login requirement.
  • Your plan limits may prevent more pages from being learned.

Quick checks

  • Keep the dashboard tab open during Ingestion.
  • Check that your site is reachable in a normal browser tab.
  • Check your plan capacity in Plan limits and upgrades.

Steps

  1. Confirm Discovery finished

    • After “Add entire website,” you see a confirmation screen with Pages Found and your limits.
    • If you haven’t clicked Confirm, Ingestion won’t start.
  2. Check job progress

    • In Train Voice Agent, watch the Knowledge Base entries change from “processing” to “ready.”
    • Very large sites may take longer.
  3. Reduce scope if needed

    • Cancel the running job.
    • Restart with a smaller scope (begin at your main site URL and ensure internal links are clear).
    • Add critical pages first; add more later.
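  4. Handle plan limits

    • On the confirmation screen, reduce the selection, or upgrade your plan (see Plan limits and upgrades).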
  5. Retry unreachable pages

    • Some pages may be blocked or require login. Only public pages are included.
    • To add just one page, use Scrape a single page.
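
If you suspect robots.txt is what blocks a page, you can test your rules locally before retrying. The sketch below uses Python's standard `urllib.robotparser`. The helper name `is_allowed`, the sample rules, and the `"*"` user agent are assumptions for illustration; this guide does not state which user-agent name the crawler sends, so treat the result as a rough indication only.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url.

    user_agent defaults to "*" because the crawler's real name is unknown here.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: paste the contents of your own /robots.txt instead.
sample_rules = """\
User-agent: *
Disallow: /account/
"""

print(is_allowed(sample_rules, "https://example.com/pricing"))
print(is_allowed(sample_rules, "https://example.com/account/settings"))
```

With these sample rules, the pricing page is allowed and the account page is not. Pages behind a login are a separate case that robots.txt does not cover.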

What you should see

  • New entries appearing in the Knowledge Base with status “processing,” then “ready.”
  • If you reduced scope, the job should complete faster.

Tips

  • Start from your homepage (https://example.com) so Discovery sees your main navigation.
  • Add high‑value pages first: FAQ, pricing, services.
  • For frequent updates, refresh only the pages that changed.
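
To get a rough idea of what a crawl starting at your homepage can reach, you can list the same-site links in the homepage HTML. This is only a sketch using Python's standard library; it is not how Discovery is implemented, and the names `LinkCollector` and `internal_links` as well as the sample HTML are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            self.links.append(urljoin(self.base_url, href))


def internal_links(html, base_url):
    """Return unique links on the same host as base_url, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    unique = []
    for link in collector.links:
        if urlparse(link).netloc == host and link not in unique:
            unique.append(link)
    return unique


# Hypothetical homepage markup; use your own page source in practice.
sample_html = (
    '<nav><a href="/pricing">Pricing</a><a href="/faq">FAQ</a>'
    '<a href="https://other.example.org/">Partner</a></nav>'
)
print(internal_links(sample_html, "https://example.com"))
```

If an important page is missing from the list, it may not be linked from your main navigation, which is one reason to add it separately as a single page.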

Troubleshooting

  • Discovery found more pages than allowed
    • On the confirmation screen, reduce the selection, or upgrade your plan.
  • Job won’t start
    • Make sure you clicked Confirm after Discovery.
  • Entries never become “ready”
    • Cancel and retry with fewer pages; check that the pages are publicly accessible.
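
When Discovery finds more pages than your plan allows, it can help to decide in advance which pages to keep. The sketch below orders a list of URLs so that high-value pages come first and then cuts the list at a page limit. The function `pick_pages`, the keyword list, and the limit of 2 are illustrative assumptions; your actual limit is shown on the confirmation screen.

```python
def pick_pages(urls, page_limit, priority_keywords=("faq", "pricing", "services")):
    """Return at most page_limit URLs, with high-value pages first.

    A URL ranks by the first priority keyword it contains; URLs with no
    keyword keep their original relative order after the prioritized ones.
    """

    def rank(url):
        lowered = url.lower()
        for position, keyword in enumerate(priority_keywords):
            if keyword in lowered:
                return position
        return len(priority_keywords)

    # sorted() is stable, so ties preserve the input order.
    return sorted(urls, key=rank)[:page_limit]


# Hypothetical list of discovered pages.
discovered = [
    "https://example.com/blog/post-1",
    "https://example.com/pricing",
    "https://example.com/about",
    "https://example.com/faq",
]
print(pick_pages(discovered, page_limit=2))
```

Pages that do not make the cut can be added later, once the first batch shows as "ready."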

© 2025 Babelbeez. All rights reserved.