webfetch: high failure rate from invalid/broken URLs — consider pre-validation #471

@anandgupta42

Description

Problem

webfetch is the #1 failing tool by volume: 934 failures in a single day, accounting for 55% of all tool failures. Breakdown:

| Error | Count |
| --- | --- |
| HTTP 404 (Not Found) | 810 |
| HTTP 403 (Forbidden) | 52 |
| Invalid URL | 51 |
| HTTP 429 (Rate Limited) | 21 |

The model frequently generates URLs that don't exist (documentation pages, API endpoints, GitHub links with wrong paths). Each failed fetch wastes a tool call, and the model often retries the same broken URL.

Impact

  • Inflates the overall failure rate (55% of all failures)
  • Wastes tool calls and tokens on retries
  • Degrades user experience when the agent keeps trying broken URLs

Suggested improvements

  1. URL validation before fetch — reject obviously malformed URLs at the tool level
  2. Cache 404s — if a URL returned 404, don't retry the same URL in the same session
  3. Rate limit handling — on 429, respect the Retry-After header or back off automatically instead of reporting it as a failure
  4. Clearer 404 reporting — when a fetch returns 404, the error message should state unambiguously that the URL doesn't exist, so the model doesn't retry it
