Agent Mode
Bounded LLM loop with a crawl tool surface.
What it is
Agent Mode (Mode B) is an LLM-backed execution loop built into GrubCrawler. You give it a goal in natural language; it plans which pages to crawl, executes them, observes the results, and keeps going until the goal is met or a stop condition is hit.
The loop runs plan → execute → observe cycles. In each cycle, the LLM sees the current context and decides what to do next. Its tools are the same crawl tools the API exposes; the agent simply calls them programmatically.
POST /api/agent/run
Submit a task for the agent to execute.
Request
{
  "task": "Find the pricing page for example.com and extract all plan names and prices.",
  "start_url": "https://example.com",
  "customer_id": "optional",
  "config": {
    "max_steps": 10,
    "max_wall_time_ms": 60000,
    "provider": "anthropic",
    "model": "claude-opus-4-7"
  }
}
Response (when complete)
{
  "run_id": "uuid",
  "status": "completed",
  "stop_reason": "goal_met",
  "result": "Found 3 plans: Starter ($9/mo), Pro ($29/mo), Enterprise (custom).",
  "steps_taken": 4,
  "wall_time_ms": 12400
}
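As a rough illustration of the request and response above, the following Python sketch builds the POST body and reads the reply using only the standard library. The base URL, the helper names `build_agent_request` and `run_agent`, and the choice to omit `customer_id`, `provider`, and `model` are assumptions made for this example; it also assumes the call returns the completed response directly and needs no extra authentication headers, which your deployment may not match.

```python
import json
import urllib.request

# Hypothetical base URL; replace with wherever your GrubCrawler instance listens.
BASE_URL = "http://localhost:8000"


def build_agent_request(task, start_url, max_steps=10, max_wall_time_ms=60000,
                        base_url=BASE_URL):
    """Build (but do not send) a POST /api/agent/run request."""
    payload = {
        "task": task,
        "start_url": start_url,
        "config": {
            "max_steps": max_steps,
            "max_wall_time_ms": max_wall_time_ms,
        },
    }
    return urllib.request.Request(
        url=f"{base_url}/api/agent/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_agent(task, start_url, **kwargs):
    """Send the request and return the decoded JSON response body."""
    request = build_agent_request(task, start_url, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A caller would then inspect fields such as `stop_reason` and `result` on the dictionary returned by `run_agent`, for example to tell a `goal_met` run apart from one that hit a limit.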
Stop conditions
The agent stops when any of these triggers:
goal_met: The LLM declares the task complete.
max_steps: Step limit reached (default 10).
max_wall_time_ms: Wall clock limit hit (default 60s).
max_failures: Too many consecutive tool failures.
policy_denied: A tool call was blocked by policy.
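To make the interaction between the plan → execute → observe cycle and these stop conditions concrete, here is a minimal sketch of a bounded loop. It is not GrubCrawler's implementation: the `Limits` class, the `PolicyDenied` exception, the `plan` and `execute` callables, the order in which conditions are checked, and the `max_failures` default of 3 are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass


@dataclass
class Limits:
    max_steps: int = 10             # documented default
    max_wall_time_ms: int = 60_000  # documented default (60s)
    max_failures: int = 3           # assumed; the docs give no default


class PolicyDenied(Exception):
    """Hypothetical signal that a tool call was blocked by policy."""


def run_loop(plan, execute, limits=None, clock=time.monotonic):
    """Run plan -> execute -> observe until a stop condition triggers.

    plan(observations) returns None when the goal is met, else an action.
    execute(action) returns an observation, or raises on tool failure.
    Returns (stop_reason, steps_taken).
    """
    limits = limits or Limits()
    started = clock()
    observations = []
    steps = 0
    consecutive_failures = 0
    while True:
        if steps >= limits.max_steps:
            return "max_steps", steps
        if (clock() - started) * 1000 >= limits.max_wall_time_ms:
            return "max_wall_time_ms", steps
        action = plan(observations)      # plan
        if action is None:
            return "goal_met", steps
        steps += 1
        try:
            observations.append(execute(action))  # execute, then observe
            consecutive_failures = 0
        except PolicyDenied:
            return "policy_denied", steps
        except Exception as error:
            consecutive_failures += 1
            observations.append(f"tool failed: {error}")
            if consecutive_failures >= limits.max_failures:
                return "max_failures", steps
```

Because every exit path returns one of the five stop reasons, the loop cannot run unbounded even if the planner never declares the goal met.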
Ghost Protocol
When a crawl returns blocked or thin content — a Cloudflare interstitial, empty page, or bot detection wall — the agent automatically triggers Ghost Protocol. It takes a screenshot of the page and asks the vision LLM to extract content from the image directly, bypassing the DOM entirely.
Ghost Protocol activates when:
thin content: Returned content is under the minimum content threshold.
block signal: A bot detection page or CAPTCHA is detected.
solver failure: The challenge solver fails after retries.
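The three triggers can be pictured as a small decision function. The sketch below is only an illustration: the `CrawlResult` shape, the 200-character threshold, the marker strings, and the check order are assumptions, since the actual threshold and detection signals are not stated here.

```python
from dataclasses import dataclass

# Assumed values for illustration; the real threshold and signals are not documented here.
MIN_CONTENT_CHARS = 200
BLOCK_MARKERS = ("captcha", "verify you are human", "just a moment")


@dataclass
class CrawlResult:
    text: str
    solver_failed: bool = False  # True if the challenge solver gave up after retries


def ghost_trigger(result, min_chars=MIN_CONTENT_CHARS):
    """Return the name of the first matching trigger, or None if none applies."""
    if result.solver_failed:
        return "solver failure"
    lowered = result.text.lower()
    if any(marker in lowered for marker in BLOCK_MARKERS):
        return "block signal"
    if len(result.text.strip()) < min_chars:
        return "thin content"
    return None
```

When `ghost_trigger` returns a non-`None` value, the agent would fall back to the screenshot-and-vision path described above instead of using the DOM text.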
GET /api/agent/runs/{run_id}
Retrieve a completed run's full trace and result.
Agent Mode requires an LLM provider key. Set ANTHROPIC_API_KEY or OPENAI_API_KEY in the environment. The default provider is Anthropic.