Web Tools

Claude Code provides two tools for accessing web content: WebFetchTool for retrieving and processing web pages, and WebSearchTool for searching the web using Claude's native search capabilities.

WebFetchTool

Fetches content from URLs and converts it to a readable format for processing.

Parameters

url (string, required)

The URL to fetch content from. Must be a valid URL.

prompt (string, required)

A prompt to run on the fetched content. Used to extract or summarize relevant information from the page.

HTML-to-Markdown Conversion

WebFetchTool converts HTML content to Markdown using the Turndown library for cleaner, more token-efficient processing. The Turndown instance is lazily initialized on first use to avoid loading the library (approximately 1.4MB retained heap) until needed.
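
The lazy initialization described above can be illustrated with a minimal sketch. The names HtmlConverter, createConverter, and getConverter are hypothetical, and the toy converter only strips tags; it stands in for the Turndown instance rather than reproducing the library's API.

```typescript
// Stand-in for the real converter; the actual tool uses the Turndown library.
interface HtmlConverter {
  convert(html: string): string
}

// Counts how many times the converter has been constructed (for illustration).
let constructions = 0

// Hypothetical factory: in the real tool this is where Turndown would be loaded.
function createConverter(): HtmlConverter {
  constructions++
  return {
    // Toy conversion that only strips tags. This is not real Markdown output.
    convert: (html) => html.replace(/<[^>]+>/g, '').trim(),
  }
}

let converter: HtmlConverter | undefined

// Build the converter on first use, then reuse the same instance.
function getConverter(): HtmlConverter {
  if (converter === undefined) {
    converter = createConverter()
  }
  return converter
}
```

Callers always go through getConverter(), so a session that never fetches a page never pays the construction cost.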

Caching

Fetched content is cached per URL in an LRU cache with a 15-minute TTL and a 50 MB size cap:

const CACHE_TTL_MS = 15 * 60 * 1000   // 15 minutes
const MAX_CACHE_SIZE_BYTES = 50 * 1024 * 1024  // 50MB max cache
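
As a rough illustration of how a TTL and a byte budget can combine, here is a simplified stand-in for such a cache. The class name FetchCache, its methods, and the injectable clock are assumptions made for this sketch, and size is approximated by string length; the real implementation may differ in all of these.

```typescript
const CACHE_TTL_MS = 15 * 60 * 1000
const MAX_CACHE_SIZE_BYTES = 50 * 1024 * 1024

interface Entry {
  content: string
  sizeBytes: number
  expiresAt: number
}

// Hypothetical class; a simplified stand-in for the real LRU cache.
class FetchCache {
  // Map preserves insertion order, so the first key is the least recently used.
  private entries = new Map<string, Entry>()
  private totalBytes = 0
  private maxBytes: number
  private ttlMs: number
  private now: () => number

  constructor(maxBytes = MAX_CACHE_SIZE_BYTES, ttlMs = CACHE_TTL_MS, now: () => number = Date.now) {
    this.maxBytes = maxBytes
    this.ttlMs = ttlMs
    this.now = now
  }

  get(url: string): string | undefined {
    const entry = this.entries.get(url)
    if (entry === undefined) return undefined
    if (this.now() >= entry.expiresAt) {
      this.remove(url) // expired
      return undefined
    }
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(url)
    this.entries.set(url, entry)
    return entry.content
  }

  set(url: string, content: string): void {
    this.remove(url)
    // Approximation: counts UTF-16 code units rather than encoded bytes.
    const sizeBytes = content.length
    if (sizeBytes > this.maxBytes) return // too large to cache at all
    this.entries.set(url, { content, sizeBytes, expiresAt: this.now() + this.ttlMs })
    this.totalBytes += sizeBytes
    // Evict least recently used entries until the cache fits the budget.
    while (this.totalBytes > this.maxBytes) {
      const oldest = this.entries.keys().next().value
      if (oldest === undefined) break
      this.remove(oldest)
    }
  }

  private remove(url: string): void {
    const entry = this.entries.get(url)
    if (entry === undefined) return
    this.totalBytes -= entry.sizeBytes
    this.entries.delete(url)
  }
}
```

Bounding by bytes rather than entry count keeps a handful of very large pages from crowding memory, while the TTL limits how stale a repeated fetch can be.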

A separate domain check cache avoids redundant preflight HTTP requests when fetching multiple paths on the same domain:

const DOMAIN_CHECK_CACHE = new LRUCache<string, true>({
  max: 128,
  ttl: 5 * 60 * 1000,  // 5 minutes
})
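
The sketch below shows one way such a cache could short-circuit repeated preflight checks. It uses a plain Map instead of LRUCache (so there is no size bound), and the names Preflight, checkedDomains, and isDomainAllowed are hypothetical. Caching only successful checks is an assumption suggested by the `true` value type above.

```typescript
// Hypothetical preflight signature; the real check is an HTTP request.
type Preflight = (hostname: string) => Promise<boolean>

const DOMAIN_CHECK_TTL_MS = 5 * 60 * 1000

// Simplified stand-in for DOMAIN_CHECK_CACHE: hostname -> expiry timestamp.
const checkedDomains = new Map<string, number>()

async function isDomainAllowed(
  url: string,
  preflight: Preflight,
  now: number = Date.now(),
): Promise<boolean> {
  const hostname = new URL(url).hostname
  const expiresAt = checkedDomains.get(hostname)
  // A fresh cached success means no preflight request is needed.
  if (expiresAt !== undefined && now < expiresAt) return true
  const ok = await preflight(hostname)
  // Assumption: only successful checks are remembered.
  if (ok) checkedDomains.set(hostname, now + DOMAIN_CHECK_TTL_MS)
  return ok
}
```

Because the key is the hostname, fetching /a and /b on the same domain within five minutes triggers a single preflight.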

Pre-approved Hosts

Certain hosts are pre-approved and do not require user permission to fetch. The pre-approved list is maintained in src/tools/WebFetchTool/preapproved.ts and checked by hostname and path.

For non-pre-approved hosts, permission is requested per domain. Users can set allow or deny rules scoped to domain:<hostname>.
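
A minimal sketch of this decision flow follows. The list entries, the Decision type, and the function names are hypothetical, and the precedence shown (deny rules first, then pre-approval, then allow rules) is an assumption rather than the documented order.

```typescript
type Decision = 'allow' | 'deny' | 'ask'

// Hypothetical entries for illustration; the real list lives in preapproved.ts.
const PREAPPROVED: { hostname: string; pathPrefix?: string }[] = [
  { hostname: 'docs.example.com' },
  { hostname: 'example.org', pathPrefix: '/docs' },
]

// Match by hostname, and by path prefix when the entry specifies one.
function isPreapproved(url: URL): boolean {
  return PREAPPROVED.some(
    (entry) =>
      entry.hostname === url.hostname &&
      (entry.pathPrefix === undefined ||
        url.pathname === entry.pathPrefix ||
        url.pathname.startsWith(entry.pathPrefix + '/')),
  )
}

function decide(rawUrl: string, allowRules: string[], denyRules: string[]): Decision {
  const url = new URL(rawUrl)
  const rule = `domain:${url.hostname}`
  if (denyRules.includes(rule)) return 'deny' // assumed to take precedence
  if (isPreapproved(url)) return 'allow'
  if (allowRules.includes(rule)) return 'allow'
  return 'ask' // no rule matched, so the user is prompted
}
```

For example, decide('https://foo.test/page', ['domain:foo.test'], []) yields 'allow' without prompting, while an unlisted host with no rules yields 'ask'.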

Domain Safety

Before fetching from a non-pre-approved domain, WebFetchTool performs a preflight check against the Anthropic API to verify the domain is safe. This prevents fetching from known-malicious hosts.

WebFetchTool will fail for authenticated or private URLs (Google Docs, Confluence, Jira, GitHub authenticated pages). The tool prompt instructs the model to look for specialized MCP tools that provide authenticated access instead.

Binary Content

When the response content type indicates binary data, the content is persisted to a temporary file on disk rather than returned inline. The tool returns the saved file path so other tools can process it.
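
The following sketch illustrates the general idea using Node's standard fs, os, and path modules. The content-type heuristic, the function names, and the webfetch- directory prefix are assumptions for illustration; the tool's actual detection rules and file naming are not described here.

```typescript
import { mkdtempSync, writeFileSync } from 'node:fs'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

// Assumption: a simple heuristic. The real tool's content-type rules may differ.
function isBinaryContentType(contentType: string): boolean {
  // Drop parameters such as "; charset=utf-8" before comparing.
  const mime = (contentType.split(';')[0] ?? '').trim().toLowerCase()
  if (mime.startsWith('text/')) return false
  const textLike = ['application/json', 'application/xml', 'application/javascript']
  return !textLike.includes(mime) && !mime.endsWith('+json') && !mime.endsWith('+xml')
}

// Write the bytes into a fresh temporary directory and return the file path.
function persistBinary(bytes: Uint8Array, fileName: string): string {
  const dir = mkdtempSync(join(tmpdir(), 'webfetch-'))
  const filePath = join(dir, fileName)
  writeFileSync(filePath, bytes)
  return filePath
}
```

Returning a path instead of the bytes keeps large or non-textual payloads out of the model's context while still letting a follow-up tool open the file.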

Key Properties

  • Read-only: Yes
  • Concurrency-safe: Yes
  • Deferred: Yes (requires ToolSearch)
  • maxResultSizeChars: 100,000

WebSearchTool

Searches the web using Claude's native web search capability, which is a server-side tool provided by the Anthropic API.

Parameters

query (string, required)

The search query. Minimum 2 characters.

allowed_domains (string[], optional)

Only include results from these domains. Useful for restricting searches to specific documentation sites.

blocked_domains (string[], optional)

Never include results from these domains.

How It Works

WebSearchTool uses Claude's built-in web_search_20250305 server tool:

  1. The tool sends the search query to the Anthropic API with a web_search tool definition.
  2. The API performs up to 8 searches per query (hardcoded maximum).
  3. Results come back as a sequence of search result blocks and text commentary.
  4. The tool parses these blocks into structured SearchResult objects containing titles and URLs.

The tool definition is built by makeToolSchema:

function makeToolSchema(input: Input): BetaWebSearchTool20250305 {
  return {
    type: 'web_search_20250305',
    name: 'web_search',
    allowed_domains: input.allowed_domains,
    blocked_domains: input.blocked_domains,
    max_uses: 8,
  }
}
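
The parsing step (step 4) might look roughly like the sketch below. The ContentBlock shapes are simplified assumptions modeled on the API's web search result blocks (error results and other fields are omitted), and parseBlocks is a hypothetical name rather than the tool's actual function.

```typescript
// Simplified block shapes; real responses carry more fields and may contain errors.
type ContentBlock =
  | { type: 'text'; text: string }
  | {
      type: 'web_search_tool_result'
      content: { type: 'web_search_result'; title: string; url: string }[]
    }

interface SearchResult {
  title: string
  url: string
}

// Separate search hits from the model's text commentary, preserving order.
function parseBlocks(blocks: ContentBlock[]): {
  results: SearchResult[]
  commentary: string[]
} {
  const results: SearchResult[] = []
  const commentary: string[] = []
  for (const block of blocks) {
    if (block.type === 'text') {
      commentary.push(block.text)
    } else {
      for (const hit of block.content) {
        results.push({ title: hit.title, url: hit.url })
      }
    }
  }
  return { results, commentary }
}
```

Keeping only titles and URLs gives the model a compact list of sources it can follow up on with WebFetchTool.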

Provider Support

WebSearchTool availability depends on the API provider:

Always available.

Key Properties

  • Read-only: Yes
  • Concurrency-safe: Yes
  • Deferred: Yes (requires ToolSearch)
  • maxResultSizeChars: 100,000

Key Source Files

  • src/tools/WebFetchTool/WebFetchTool.ts: WebFetch implementation and permission checks
  • src/tools/WebFetchTool/utils.ts: URL fetching, caching, HTML-to-Markdown conversion
  • src/tools/WebFetchTool/preapproved.ts: Pre-approved host list
  • src/tools/WebSearchTool/WebSearchTool.ts: WebSearch implementation using native search API