Web search
Models that ground their answers in live web results. Citations, query budgets, when to use it.
Some models can search the web during generation, ground their answers in live results, and return citations. Useful for "what happened today," news summarization, fact-checking, and any prompt where the model's training cutoff is the wrong answer. With the AI SDK, plain models reach the web through user-defined tools like Tavily or Exa.
Models with web search
Filter the catalog by the Web search capability on /models. Top picks:
| Model | Search style |
|---|---|
| xai/grok-4 | Auto-enabled; integrated with X / web |
| perplexity/sonar-pro | Citation-first; web-grounded by design |
| perplexity/sonar | Cheaper Perplexity tier |
| anthropic/claude-opus-4.6 | Web search beta; explicit opt-in per request |
| google/gemini-2.5-pro | Google grounding; uses Search results |
| openai/gpt-5.4 | Web search via tool use |
With xAI Grok (auto-enabled)
Grok decides on its own whether to search. No flag needed:
```ts
import { streamText } from "ai"

const result = streamText({
  model: "xai/grok-4",
  baseURL: "https://synapse.garden/api/v1",
  apiKey: process.env.MG_KEY,
  prompt: "What's the latest news about open-source LLMs?",
})

for await (const part of result.textStream) {
  process.stdout.write(part)
}
```

With Perplexity (always grounded)
Perplexity Sonar is search-first — every answer is grounded in retrieved web pages with explicit citations:
```ts
const result = streamText({
  model: "perplexity/sonar-pro",
  prompt: "Summarize recent advances in vector databases.",
  providerOptions: {
    perplexity: { searchMode: "web" }, // "web" | "academic"
  },
})

const final = await result.text
const meta = await result.providerMetadata
console.log(meta?.perplexity?.citations)
// [{ url: "https://...", title: "..." }, ...]
```

With Anthropic Claude (beta)
Claude's web search is a separate tool:
```ts
const result = streamText({
  model: "anthropic/claude-opus-4.6",
  prompt: "What new AI papers were published this week?",
  providerOptions: {
    anthropic: {
      tools: [
        { type: "web_search_20250305", name: "web_search", max_uses: 3 },
      ],
    },
  },
})
```

`max_uses` caps how many search calls Claude can make per prompt — it defends against runaway exploration.
With Google Gemini (grounding)
Gemini's "grounding" is opt-in:
```ts
const result = streamText({
  model: "google/gemini-2.5-pro",
  prompt: "What's the population of Tokyo today?",
  providerOptions: {
    google: { useSearchGrounding: true },
  },
})
```

The response carries grounding chunks in `providerMetadata.google.groundingMetadata`.
With OpenAI (tool-use pattern)
OpenAI doesn't have native web search yet — use a tool with a search API of your choice:
```ts
import { streamText, tool } from "ai"
import { z } from "zod"

streamText({
  model: "openai/gpt-5.4",
  prompt: "What's the latest in quantum computing?",
  tools: {
    webSearch: tool({
      description: "Search the web and return top 5 results",
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => {
        // Tavily's search endpoint takes a POST with a JSON body
        const res = await fetch("https://api.tavily.com/search", {
          method: "POST",
          headers: {
            Authorization: `Bearer ${process.env.TAVILY_KEY}`,
            "Content-Type": "application/json",
          },
          body: JSON.stringify({ query }),
        })
        const data = await res.json()
        return data.results?.slice(0, 5) // keep the "top 5" promised in the description
      },
    }),
  },
  maxSteps: 3,
})
```

This pattern works with any model that supports tools — bring your own search API (Tavily, Exa, Brave, SerpAPI, etc.).
Citations
Models that ground in web results return citations in their providerMetadata:
```ts
const meta = await result.providerMetadata

// Perplexity:
meta?.perplexity?.citations
// [{ url, title, snippet }, ...]

// Claude web_search tool:
meta?.anthropic?.toolUses
// [{ type: "web_search", input: { query }, results: [...] }, ...]

// Gemini grounding:
meta?.google?.groundingMetadata?.groundingChunks
// [{ web: { uri, title } }, ...]
```

For citation-rich UX, render the citations alongside the answer with hover/click-through to the source URL.
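When a UI has to handle more than one provider, it helps to map these provider-specific shapes onto a single citation type before rendering. A minimal sketch, assuming the metadata shapes shown above — `normalizeCitations` is a hypothetical helper, not part of the SDK:

```ts
type Citation = { url: string; title?: string }

// Hypothetical helper: flatten the three provider-specific citation
// shapes documented above into one { url, title } list for rendering.
function normalizeCitations(meta: any): Citation[] {
  // Perplexity: [{ url, title, snippet }, ...]
  if (meta?.perplexity?.citations) {
    return meta.perplexity.citations.map((c: any) => ({ url: c.url, title: c.title }))
  }
  // Gemini grounding: [{ web: { uri, title } }, ...]
  if (meta?.google?.groundingMetadata?.groundingChunks) {
    return meta.google.groundingMetadata.groundingChunks
      .filter((c: any) => c.web)
      .map((c: any) => ({ url: c.web.uri, title: c.web.title }))
  }
  // Claude web_search tool uses: [{ ..., results: [...] }, ...]
  // (result item fields assumed to be { url, title })
  if (meta?.anthropic?.toolUses) {
    return meta.anthropic.toolUses.flatMap((t: any) =>
      (t.results ?? []).map((r: any) => ({ url: r.url, title: r.title })),
    )
  }
  return []
}
```

One normalizer keeps the citation component provider-agnostic, so switching models doesn't touch the UI.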
Pricing
Web search is billed per search query, not per token. Browse /models for live rates. Typical:
- Perplexity Sonar — built into the per-token rate; no separate query fee
- Anthropic web search beta — ~$10 per 1,000 queries
- Google grounding — ~$35 per 1,000 grounded responses
- xAI Grok — auto-included; small surcharge for heavy search usage
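For budgeting, the per-query rates above reduce to simple arithmetic. A sketch using the illustrative numbers from the list (check /models for live rates; the rate table and function here are assumptions, not an SDK API):

```ts
// USD per 1,000 queries, from the illustrative rates listed above.
const RATE_PER_1K: Record<string, number> = {
  "anthropic-web-search": 10, // ~$10 / 1k queries
  "google-grounding": 35,     // ~$35 / 1k grounded responses
}

// Providers with bundled or per-token pricing return 0 (no separate query fee).
function searchCostUSD(provider: string, queries: number): number {
  const rate = RATE_PER_1K[provider]
  if (rate === undefined) return 0
  return (queries / 1000) * rate
}
```

At these rates, 2,000 grounded Gemini responses cost about $70, versus about $20 for the same volume through Claude's web search.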
Caveats
- Stale-by-a-day. Even "live" web search isn't real-time — the underlying search index typically lags by hours. For minute-by-minute data (stock prices, sports scores), use a dedicated API tool.
- Hallucinated citations. Models sometimes fabricate URLs or misattribute quotes. Always render citations as links and let the user click through.
- Geographic bias. Search results lean toward English / US sources unless you specify a region. Pass `country`/`lang` hints in your tool call when relevant.
- Rate limits. Web search has separate quotas per provider — heavy use can throttle independently of token rate limits.
- Privacy. Your prompt is sent to the upstream search engine. If your prompt contains PII, redact before searching.
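On the privacy point: if queries are built from user input, redact before the search call. A minimal sketch that strips obvious emails and phone-like numbers — `redactQuery` and its regexes are illustrative only; real redaction needs a proper PII detector:

```ts
// Strip obvious PII from a query before it reaches an upstream search engine.
// Two naive patterns: email addresses and phone-number-like digit runs.
function redactQuery(query: string): string {
  return query
    .replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, "[email]")
    .replace(/\+?\d[\d\s().-]{7,}\d/g, "[phone]")
}
```

Run this inside the tool's `execute` (or before setting `prompt`) so the redacted form is the only thing that leaves your infrastructure.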
When NOT to use web search
- Factual questions older than the training cutoff — the model already knows. Adding search wastes tokens and adds latency.
- Reasoning-heavy questions — grounding doesn't help with "explain why" or "design a system." Use a reasoning model instead.
- Internal data — RAG over your own corpus (see Embeddings) is faster, cheaper, and more accurate than asking a model to "search the web for our company docs."
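A crude way to apply these rules automatically is keyword routing: enable search only when the prompt signals recency. A real router would use a classifier; `shouldUseWebSearch` and its cue list below are an illustrative sketch, not an SDK feature:

```ts
// Words that suggest the answer depends on post-cutoff information.
const RECENCY_CUES = ["today", "latest", "this week", "current", "news", "recent"]

// Naive router: turn on web search only for recency-flavored prompts,
// leaving reasoning-heavy or internal-data questions to cheaper paths.
function shouldUseWebSearch(prompt: string): boolean {
  const p = prompt.toLowerCase()
  return RECENCY_CUES.some((cue) => p.includes(cue))
}
```

Even this blunt gate avoids paying per-query search fees on "explain why" prompts where grounding adds nothing.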