Web search
Models that ground their answers in live web results. Citations, query budgets, when to use it.
Some models can search the web during generation, ground their answers in live results, and return citations. Useful for "what happened today," news summarization, fact-checking, and any prompt where the model's training cutoff is the wrong answer. With the AI SDK, plain models reach the web through user-defined tools like Tavily or Exa.
Models with web search
Filter the catalog by the Web search capability on /models. Top picks:
| Model | Search style |
|---|---|
| xai/grok-4 | Auto-enabled; integrated with X / web |
| perplexity/sonar-pro | Citation-first; web-grounded by design |
| perplexity/sonar | Cheaper Perplexity tier |
| anthropic/claude-opus-4.6 | Web search beta; explicit opt-in per request |
| google/gemini-2.5-pro | Google grounding; uses Search results |
| openai/gpt-5.4 | Web search via tool use |
With xAI Grok (auto-enabled)
Grok decides on its own whether to search. No flag needed:
```ts
import { streamText } from "ai"

const result = streamText({
  model: "xai/grok-4",
  baseURL: "https://synapse.garden/api/v1",
  apiKey: process.env.MG_KEY,
  prompt: "What's the latest news about open-source LLMs?",
})

for await (const part of result.textStream) {
  process.stdout.write(part)
}
```

With Perplexity (always grounded)
Perplexity Sonar is search-first — every answer is grounded in retrieved web pages with explicit citations:
```ts
const result = streamText({
  model: "perplexity/sonar-pro",
  prompt: "Summarize recent advances in vector databases.",
  providerOptions: {
    perplexity: { searchMode: "web" }, // "web" | "academic"
  },
})

const final = await result.text
const meta = await result.providerMetadata
console.log(meta?.perplexity?.citations)
// [{ url: "https://...", title: "..." }, ...]
```

With Anthropic Claude (beta)
Claude's web search is a separate tool:
```ts
const result = streamText({
  model: "anthropic/claude-opus-4.6",
  prompt: "What new AI papers were published this week?",
  providerOptions: {
    anthropic: {
      tools: [
        { type: "web_search_20250305", name: "web_search", max_uses: 3 },
      ],
    },
  },
})
```

`max_uses` caps how many search calls Claude can make per prompt — it defends against runaway exploration.
With Google Gemini (grounding)
Gemini's "grounding" is opt-in:
```ts
const result = streamText({
  model: "google/gemini-2.5-pro",
  prompt: "What's the population of Tokyo today?",
  providerOptions: {
    google: { useSearchGrounding: true },
  },
})
```

The response carries grounding chunks in `providerMetadata.google.groundingMetadata`.
With OpenAI (tool-use pattern)
OpenAI doesn't have native web search yet — use a tool with a search API of your choice:
```ts
import { streamText, tool } from "ai"
import { z } from "zod"

streamText({
  model: "openai/gpt-5.4",
  prompt: "What's the latest in quantum computing?",
  tools: {
    webSearch: tool({
      description: "Search the web and return top 5 results",
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => {
        // Tavily's search endpoint takes a POST with a JSON body
        const res = await fetch("https://api.tavily.com/search", {
          method: "POST",
          headers: {
            Authorization: `Bearer ${process.env.TAVILY_KEY}`,
            "Content-Type": "application/json",
          },
          body: JSON.stringify({ query }),
        })
        const data = await res.json()
        return data.results?.slice(0, 5) // keep the "top 5" promised in the description
      },
    }),
  },
  maxSteps: 3,
})
```

This pattern works with any model that supports tools — bring your own search API (Tavily, Exa, Brave, SerpAPI, etc.).
Citations
Models that ground in web results return citations in their providerMetadata:
```ts
const meta = await result.providerMetadata

// Perplexity:
meta?.perplexity?.citations
// [{ url, title, snippet }, ...]

// Claude web_search tool:
meta?.anthropic?.toolUses
// [{ type: "web_search", input: { query }, results: [...] }, ...]

// Gemini grounding:
meta?.google?.groundingMetadata?.groundingChunks
// [{ web: { uri, title } }, ...]
```

For citation-rich UX, render the citations alongside the answer with hover/click-through to the source URL.
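When a UI has to handle more than one provider, it helps to map these provider-specific shapes onto a single citation type before rendering. A minimal sketch, assuming the metadata shapes shown above — `normalizeCitations` is a hypothetical helper, not part of the SDK:

```ts
type Citation = { url: string; title?: string }

// Hypothetical helper: flatten the three provider-specific citation
// shapes documented above into one { url, title } list for rendering.
function normalizeCitations(meta: any): Citation[] {
  // Perplexity: [{ url, title, snippet }, ...]
  if (meta?.perplexity?.citations) {
    return meta.perplexity.citations.map((c: any) => ({ url: c.url, title: c.title }))
  }
  // Gemini grounding: [{ web: { uri, title } }, ...]
  if (meta?.google?.groundingMetadata?.groundingChunks) {
    return meta.google.groundingMetadata.groundingChunks
      .filter((c: any) => c.web)
      .map((c: any) => ({ url: c.web.uri, title: c.web.title }))
  }
  // Claude web_search tool uses: [{ ..., results: [...] }, ...]
  // (result item fields assumed to be { url, title })
  if (meta?.anthropic?.toolUses) {
    return meta.anthropic.toolUses.flatMap((t: any) =>
      (t.results ?? []).map((r: any) => ({ url: r.url, title: r.title })),
    )
  }
  return []
}
```

One normalizer keeps the citation component provider-agnostic, so switching models doesn't touch the UI.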
Pricing
Web search is billed per search query, not per token. Browse /models for live rates. Typical:
- Perplexity Sonar — built into the per-token rate; no separate query fee
- Anthropic web search beta — ~$10 per 1,000 queries
- Google grounding — ~$35 per 1,000 grounded responses
- xAI Grok — auto-included; small surcharge for heavy search usage
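For budgeting, the per-query rates above reduce to simple arithmetic. A sketch using the illustrative numbers from the list (check /models for live rates; the rate table and function here are assumptions, not an SDK API):

```ts
// USD per 1,000 queries, from the illustrative rates listed above.
const RATE_PER_1K: Record<string, number> = {
  "anthropic-web-search": 10, // ~$10 / 1k queries
  "google-grounding": 35,     // ~$35 / 1k grounded responses
}

// Providers with bundled or per-token pricing return 0 (no separate query fee).
function searchCostUSD(provider: string, queries: number): number {
  const rate = RATE_PER_1K[provider]
  if (rate === undefined) return 0
  return (queries / 1000) * rate
}
```

At these rates, 2,000 grounded Gemini responses cost about $70, versus about $20 for the same volume through Claude's web search.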
Caveats
- Stale-by-a-day. Even "live" web search isn't real-time — the underlying search index typically lags by hours. For minute-by-minute data (stock prices, sports scores), use a dedicated API tool.
- Hallucinated citations. Models sometimes fabricate URLs or misattribute quotes. Always render citations as links and let the user click through.
- Geographic bias. Search results lean toward English / US sources unless you specify a region. Pass `country`/`lang` hints in your tool call when relevant.
- Rate limits. Web search has separate quotas per provider — heavy use can throttle independently of token rate limits.
- Privacy. Your prompt is sent to the upstream search engine. If your prompt contains PII, redact before searching.
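On the privacy point: if queries are built from user input, redact before the search call. A minimal sketch that strips obvious emails and phone-like numbers — `redactQuery` and its regexes are illustrative only; real redaction needs a proper PII detector:

```ts
// Strip obvious PII from a query before it reaches an upstream search engine.
// Two naive patterns: email addresses and phone-number-like digit runs.
function redactQuery(query: string): string {
  return query
    .replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, "[email]")
    .replace(/\+?\d[\d\s().-]{7,}\d/g, "[phone]")
}
```

Run this inside the tool's `execute` (or before setting `prompt`) so the redacted form is the only thing that leaves your infrastructure.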
When NOT to use web search
- Factual questions older than the training cutoff — the model already knows. Adding search wastes tokens and adds latency.
- Reasoning-heavy questions — grounding doesn't help with "explain why" or "design a system." Use a reasoning model instead.
- Internal data — RAG over your own corpus (see Embeddings) is faster, cheaper, and more accurate than asking a model to "search the web for our company docs."
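A crude way to apply these rules automatically is keyword routing: enable search only when the prompt signals recency. A real router would use a classifier; `shouldUseWebSearch` and its cue list below are an illustrative sketch, not an SDK feature:

```ts
// Words that suggest the answer depends on post-cutoff information.
const RECENCY_CUES = ["today", "latest", "this week", "current", "news", "recent"]

// Naive router: turn on web search only for recency-flavored prompts,
// leaving reasoning-heavy or internal-data questions to cheaper paths.
function shouldUseWebSearch(prompt: string): boolean {
  const p = prompt.toLowerCase()
  return RECENCY_CUES.some((cue) => p.includes(cue))
}
```

Even this blunt gate avoids paying per-query search fees on "explain why" prompts where grounding adds nothing.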