meta · Multimodal · Released 2025-04-05

llama-4-scout

meta/llama-4-scout id

Llama 4 Scout is the best multimodal model in its class and is more powerful than Meta's Llama 3 models, while fitting on a single H100 GPU. Natively, Llama 4 Scout supports an industry-leading context window of up to 10M tokens; this endpoint serves a 128K window.

Type: Chat · Tool use

Use llama-4-scout
// Drop-in OpenAI-compatible client (Vercel AI SDK)
import { createOpenAI } from '@ai-sdk/openai'
import { generateText } from 'ai'

const synapse = createOpenAI({
  baseURL: 'https://synapse.garden/api/v1',
  apiKey: process.env.MG_KEY,
})

const { text } = await generateText({
  model: synapse('meta/llama-4-scout'),
  prompt: 'Why is the sky blue?',
})
Context window: 128K tokens
Max output: 8.2K tokens
Input: $0.187 per million tokens
Output: $0.726 per million tokens
PRICING

List prices, every modality.

Rate (per million tokens, USD):
Input: $0.187
Output: $0.726
MORE FROM META

Other meta models

FAQ · LLAMA-4-SCOUT

Frequently asked

01 / 04

How do I call llama-4-scout from my code?

Use the OpenAI or Anthropic SDK and point baseURL at https://synapse.garden/api/v1. Set model: 'meta/llama-4-scout' and supply your Synapse Garden API key. No code changes are needed beyond the base URL.
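For illustration, a minimal sketch of the same call without any SDK, using only the global fetch available in Node 18+ (the buildChatBody and ask helpers are assumptions for this sketch, not part of the API; MG_KEY is assumed to hold your Synapse Garden key):

```typescript
// Build the OpenAI-compatible chat request body (kept pure so it is easy to test).
function buildChatBody(prompt: string) {
  return {
    model: 'meta/llama-4-scout',
    messages: [{ role: 'user', content: prompt }],
  }
}

// POST it to the OpenAI-compatible chat completions endpoint.
async function ask(prompt: string): Promise<string> {
  const res = await fetch('https://synapse.garden/api/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.MG_KEY}`,
    },
    body: JSON.stringify(buildChatBody(prompt)),
  })
  const data = await res.json()
  // Standard OpenAI-style response shape: first choice's message content.
  return data.choices[0].message.content
}
```

Because the endpoint is OpenAI-compatible, the request and response shapes match the standard chat completions schema.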

02 / 04

How much does llama-4-scout cost?

Input: $0.187 per million tokens. Output: $0.726 per million tokens. The free tier includes one million tokens every month at no cost.
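Those two rates are enough to estimate a request's cost up front. A small sketch (the estimateCostUSD helper is hypothetical, not part of the API, and it ignores the free-tier allowance):

```typescript
// List prices from the table above, in USD per million tokens.
const INPUT_PER_M = 0.187
const OUTPUT_PER_M = 0.726

// Estimate the cost of one request from its token counts.
function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  return (inputTokens * INPUT_PER_M + outputTokens * OUTPUT_PER_M) / 1_000_000
}

// e.g. 10,000 input tokens and 1,000 output tokens:
// 10,000 × $0.187/M + 1,000 × $0.726/M = $0.00187 + $0.000726 = $0.002596
```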

03 / 04

What's the context window for llama-4-scout?

llama-4-scout supports a context window of 128K tokens, with a maximum output of 8.2K tokens.
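Prompt tokens and reserved output tokens share that window, so a pre-flight check can catch oversized requests before they hit the API. A sketch (fitsContext is a hypothetical helper, not part of the API):

```typescript
// Limits for this endpoint, from the listing above.
const CONTEXT_WINDOW = 128_000
const MAX_OUTPUT = 8_200

// A request fits if prompt tokens plus the output budget stay inside the window.
function fitsContext(promptTokens: number, maxOutputTokens: number = MAX_OUTPUT): boolean {
  return promptTokens + maxOutputTokens <= CONTEXT_WINDOW
}
```

With the full 8.2K output budget reserved, prompts up to 119,800 tokens fit; anything larger should be truncated or summarized first.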

04 / 04

Do I need a separate Anthropic or OpenAI account?

No. Synapse Garden is the single API surface — one key gives you OpenAI, Anthropic, Google, Meta, Mistral, DeepSeek, xAI, Cohere, and more. Billing, rate limits, and audit logs are unified.

READY

Try llama-4-scout in three minutes.

Sign up, create a key, drop our base URL into your existing client. The free tier includes a million tokens every month — no credit card.