Agent Mode

Bounded LLM loop with a crawl tool surface.

What it is

Agent Mode (Mode B) is an LLM-backed execution loop built into GrubCrawler. You give it a goal in natural language; it plans which pages to crawl, executes them, observes the results, and keeps going until the goal is met or a stop condition is hit.

The loop runs plan → execute → observe cycles. In each cycle, the LLM sees the current context and decides what to do next. The tools are the same crawl tools the API exposes; the agent simply calls them programmatically.

POST /api/agent/run

Submit a task for the agent to execute.

Request

{
  "task": "Find the pricing page for example.com and extract all plan names and prices.",
  "start_url": "https://example.com",
  "customer_id": "optional",
  "config": {
    "max_steps": 10,
    "max_wall_time_ms": 60000,
    "provider": "anthropic",
    "model": "claude-opus-4-7"
  }
}

Response (when complete)

{
  "run_id": "uuid",
  "status": "completed",
  "stop_reason": "goal_met",
  "result": "Found 3 plans: Starter ($9/mo), Pro ($29/mo), Enterprise (custom).",
  "steps_taken": 4,
  "wall_time_ms": 12400
}
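The request above can be assembled and sent with the Python standard library alone. This is a minimal sketch, not an official client: the base URL `http://localhost:8000` is an assumption (the source does not say where GrubCrawler listens), no authentication header is sent because none is documented, and the helper names `build_agent_request` and `run_agent` are hypothetical. It also assumes the call returns once the run is complete, which the source does not confirm.

```python
import json
import urllib.request

# Assumed base URL; the documentation does not state a host or port.
BASE_URL = "http://localhost:8000"


def build_agent_request(task, start_url, max_steps=10, max_wall_time_ms=60000,
                        provider="anthropic", model=None, customer_id=None):
    """Build (but do not send) a POST request for /api/agent/run."""
    config = {
        "max_steps": max_steps,
        "max_wall_time_ms": max_wall_time_ms,
        "provider": provider,
    }
    if model is not None:
        config["model"] = model
    payload = {"task": task, "start_url": start_url, "config": config}
    if customer_id is not None:
        payload["customer_id"] = customer_id
    return urllib.request.Request(
        BASE_URL + "/api/agent/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_agent(request, timeout_s=90):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `run_agent(build_agent_request(task, start_url))` would return a dictionary shaped like the response above, so `result["stop_reason"]` and `result["steps_taken"]` can be read directly. The client timeout is set above the default wall-time limit so the server-side limit fires first.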

Stop conditions

The agent stops when any of these triggers:

  • goal_met: The LLM declares the task complete
  • max_steps: Step limit reached (default 10)
  • max_wall_time_ms: Wall-clock limit hit (default 60000 ms, i.e. 60 s)
  • max_failures: Too many consecutive tool failures
  • policy_denied: A tool call was blocked by policy
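The stop conditions above can be sketched as a bounded loop. This is an illustration of the idea, not GrubCrawler's actual implementation: the `plan` and `execute` callables, the `PolicyDenied` exception, the `max_failures` default of 3, and the order in which the limits are checked are all assumptions.

```python
import time
from dataclasses import dataclass


@dataclass
class Limits:
    max_steps: int = 10
    max_wall_time_ms: int = 60_000
    max_failures: int = 3  # assumed; the source gives no default


class PolicyDenied(Exception):
    """Hypothetical signal that policy blocked a tool call."""


def run_loop(plan, execute, limits=None, clock=time.monotonic):
    """Run plan -> execute -> observe cycles until a stop condition fires.

    plan(context) returns None when the goal is met, otherwise an action.
    execute(action) returns an observation, raises PolicyDenied when the
    call is blocked, or raises any other exception on tool failure.
    Returns (stop_reason, steps_taken).
    """
    limits = limits or Limits()
    start = clock()
    context = []   # (action, observation) pairs shown to the planner
    failures = 0   # consecutive tool failures
    steps = 0
    while True:
        if steps >= limits.max_steps:
            return "max_steps", steps
        if (clock() - start) * 1000 >= limits.max_wall_time_ms:
            return "max_wall_time_ms", steps
        action = plan(context)
        if action is None:
            return "goal_met", steps
        steps += 1
        try:
            observation = execute(action)
            failures = 0  # a success resets the consecutive-failure count
        except PolicyDenied:
            return "policy_denied", steps
        except Exception as exc:
            failures += 1
            observation = f"error: {exc}"
            if failures >= limits.max_failures:
                return "max_failures", steps
        context.append((action, observation))
```

Checking the budgets before each planning call means a run never starts a cycle it has no budget for. The trade-off is that a goal that would have been declared met right after the final step is reported as `max_steps`.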

Ghost Protocol

When a crawl returns blocked or thin content (a Cloudflare interstitial, an empty page, or a bot-detection wall), the agent automatically triggers Ghost Protocol. It takes a screenshot of the page and asks the vision LLM to extract content directly from the image, bypassing the DOM entirely.

Ghost Protocol activates when:

  • thin content: Returned content is under the minimum content threshold
  • block signal: A bot detection page or CAPTCHA is detected
  • solver failure: The challenge solver fails after retries
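The three activation conditions above amount to a small decision function. The sketch below is illustrative only: the 200-character threshold, the marker strings, the `CrawlResult` type, and the snake_case reason labels are assumptions, since the source names the conditions but does not document how they are detected.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed value; the source only says a minimum content threshold exists.
MIN_CONTENT_CHARS = 200

# Illustrative markers only; real block detection is not documented here.
BLOCK_MARKERS = ("just a moment", "verify you are human", "captcha")


@dataclass
class CrawlResult:
    text: str                    # content extracted from the DOM
    solver_failed: bool = False  # challenge solver gave up after retries


def ghost_trigger(result: CrawlResult) -> Optional[str]:
    """Return why to fall back to screenshot extraction, or None."""
    if result.solver_failed:
        return "solver_failure"
    lowered = result.text.lower()
    if any(marker in lowered for marker in BLOCK_MARKERS):
        return "block_signal"
    if len(result.text.strip()) < MIN_CONTENT_CHARS:
        return "thin_content"
    return None
```

The block check runs before the length check because an interstitial page is usually short as well, and the more specific reason is more useful in a trace.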

GET /api/agent/runs/{run_id}

Retrieve a completed run's full trace and result.

Agent Mode requires an LLM provider key. Set ANTHROPIC_API_KEY or OPENAI_API_KEY in the environment. The default provider is Anthropic.
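A preflight check along these lines can catch a missing key before a run is submitted. It is a sketch under assumptions: the provider identifier `"openai"` is inferred from the variable name (only `"anthropic"` appears in the request example), and `resolve_provider` is a hypothetical helper, not part of GrubCrawler.

```python
import os

# Environment variable expected for each provider. The "openai" identifier
# is an assumption; the source only shows "anthropic" as a provider value.
PROVIDER_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def resolve_provider(env=None, requested=None):
    """Return the provider to use, or raise if its API key is not set."""
    env = os.environ if env is None else env
    provider = requested or "anthropic"  # Anthropic is the documented default
    if provider not in PROVIDER_KEYS:
        raise ValueError(f"unknown provider: {provider}")
    key_name = PROVIDER_KEYS[provider]
    if not env.get(key_name):
        raise RuntimeError(f"{key_name} must be set to use provider '{provider}'")
    return provider
```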