Z.AI · Single modality · Released 2026-01-19

glm-4.7-flash

zai/glm-4.7-flash id

GLM-4.7-Flash balances high performance with efficiency, making it the perfect lightweight deployment option. Beyond coding, it is also recommended for creative writing, translation, long-context tasks, and roleplay.

Reasoning · Tool use


Use glm-4.7-flash

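// Drop-in OpenAI-compatible client.
// Assumption: the provider is created with @ai-sdk/openai-compatible,
// because generateText itself does not take baseURL or apiKey.
import { generateText } from 'ai'
import { createOpenAICompatible } from '@ai-sdk/openai-compatible'

const synapse = createOpenAICompatible({
  name: 'synapse-garden', // arbitrary provider label chosen for this example
  baseURL: 'https://synapse.garden/api/v1',
  apiKey: process.env.MG_KEY,
})

const { text } = await generateText({
  model: synapse('zai/glm-4.7-flash'),
  prompt: 'Why is the sky blue?',
})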

Context window	200K
Max output	131K
Input	$0.077/M
Output	$0.440/M

PRICING

List prices, every modality.

Rate	Per million tokens · USD
Input	$0.077/M
Output	$0.440/M

Honest list prices · How we calculate prices

Other Z.AI models

See all 15

glm-5.2 · REASONING · TOOLS
GLM-5.2 delivers powerful coding capabilities, usable 1M-context support, and continued strengths in long-horizon tasks.
Input: $1.65/M · Output: $4.95/M · Context: 1M

glm-5.2-fast · REASONING · TOOLS
GLM-5.2 delivers powerful coding capabilities, usable 1M-context support, and continued strengths in long-horizon tasks.
Input: $3.30/M · Output: $11.28/M · Context: 1M

glm-5.1 · TOOLS · REASONING
Input: $1.43/M · Output: $4.73/M · Context: 202K

glm-5v-turbo · REASONING · TOOLS
GLM-5V-Turbo is Z.AI’s first multimodal coding foundation model, built for vision-based coding tasks. It can natively process multimodal inputs such as images, video, and text, while also excelling at long-horizon planning, complex coding, and action execution. Deeply optimized for agent workflows, it works seamlessly with agents such as Claude Code and OpenClaw to complete the full loop of “understand the environment → plan actions → execute tasks”.
Input: $1.32/M · Output: $4.40/M · Context: 200K

glm-5-turbo · REASONING · TOOLS
GLM 5 Turbo is a foundation model deeply optimized for the OpenClaw scenario. It has been specifically optimized for the core requirements of OpenClaw tasks since the training phase, enhancing key capabilities such as tool invocation, command following, timed and persistent tasks, and long-chain execution.
Input: $1.32/M · Output: $4.40/M · Context: 202.8K

glm-5 · REASONING · TOOLS
GLM-5 is Zai’s new-generation flagship foundation model, designed for Agentic Engineering, capable of providing reliable productivity in complex system engineering and long-range Agent tasks. In terms of Coding and Agent capabilities, GLM-5 has achieved state-of-the-art (SOTA) performance in open source, with its usability in real programming scenarios approaching that of Claude Opus 4.5.
Input: $1.04/M · Output: $3.47/M · Context: 202.8K

glm-4.7-flashx · REASONING · TOOLS
GLM-4.7-Flash balances high performance with efficiency, making it the perfect lightweight deployment option.
Input: $0.066/M · Output: $0.440/M · Context: 200K

glm-4.7 · REASONING · TOOLS
GLM-4.7 is Z.ai’s latest flagship model, with major upgrades focused on two key areas: stronger coding capabilities and more stable multi-step reasoning and execution.
Input: $0.660/M · Output: $2.42/M · Context: 200K

glm-4.6 · TOOLS · REASONING
As the latest iteration in the GLM series, GLM-4.6 achieves comprehensive enhancements across multiple domains, including real-world coding, long-context processing, reasoning, searching, writing, and agentic applications.
Input: $0.660/M · Output: $2.42/M · Context: 200K

glm-4.6v · TOOLS · REASONING
GLM-4.6V series are Z.ai’s iterations in a multimodal large language model. GLM-4.6V scales its context window to 128k tokens in training, and achieves SoTA performance in visual understanding among models of similar parameter scales.
Input: $0.330/M · Output: $0.990/M · Context: 128K

glm-4.6v-flash · REASONING · TOOLS
For local deployment and low-latency applications. GLM-4.6V series are Z.ai’s iterations in a multimodal large language model. GLM-4.6V scales its context window to 128k tokens in training, and achieves SoTA performance in visual understanding among models of similar parameter scales.
Input: not listed · Output: not listed · Context: 128K

glm-4.5v · TOOLS · REASONING
Built on the GLM-4.5-Air base model, GLM-4.5V inherits proven techniques from GLM-4.1V-Thinking while achieving effective scaling through a powerful 106B-parameter MoE architecture.
Input: $0.660/M · Output: $1.98/M · Context: 66K

FAQ · GLM-4.7-FLASH

Frequently asked

01 / 04

How do I call glm-4.7-flash from my code?

Use the OpenAI or Anthropic SDK and point baseURL at https://synapse.garden/api/v1. Set model to 'zai/glm-4.7-flash' and supply your Synapse Garden API key. No other code changes are needed.
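For readers who prefer to see the raw request, here is a minimal sketch that uses plain fetch instead of an SDK. The base URL and model ID come from this page. The /chat/completions path, the Bearer authorization header, and the response shape are assumptions based on the OpenAI-compatibility claim, and buildChatRequest and ask are hypothetical helper names.

```typescript
// Base URL and model ID as given on this page.
const BASE_URL = 'https://synapse.garden/api/v1';
const MODEL_ID = 'zai/glm-4.7-flash';

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds an OpenAI-style chat completion request.
// The /chat/completions path and Bearer auth are assumptions, not
// something this page confirms.
function buildChatRequest(apiKey: string, prompt: string): ChatRequest {
  return {
    url: `${BASE_URL}/chat/completions`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model: MODEL_ID,
        messages: [{ role: 'user', content: prompt }],
      }),
    },
  };
}

// Sends the request and returns the first choice's text.
// Assumes the OpenAI response shape: choices[0].message.content.
async function ask(apiKey: string, prompt: string): Promise<string> {
  const { url, init } = buildChatRequest(apiKey, prompt);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`request failed: HTTP ${res.status}`);
  const data = (await res.json()) as {
    choices: { message: { content: string } }[];
  };
  return data.choices[0].message.content;
}
```

Calling ask with the key stored in MG_KEY and the prompt 'Why is the sky blue?' would mirror the SDK example. Keeping the request builder separate from the network call makes the payload easy to inspect before anything is sent.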

02 / 04

How much does glm-4.7-flash cost?

Input: $0.077 per million tokens. Output: $0.440 per million tokens. The free tier includes a million tokens every month at no cost.
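As a rough illustration of the arithmetic, the sketch below estimates the list-price cost of a single request from the two rates quoted above. estimateCostUsd is a hypothetical helper, and it ignores the free tier and any discounts because this page does not describe how they are applied.

```typescript
// List prices quoted on this page, in USD per million tokens.
const INPUT_USD_PER_M = 0.077;
const OUTPUT_USD_PER_M = 0.44;

// Hypothetical helper: estimates the list-price cost of one request.
// Ignores the free tier and any discounts.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError('token counts must be non-negative');
  }
  return (
    (inputTokens / 1_000_000) * INPUT_USD_PER_M +
    (outputTokens / 1_000_000) * OUTPUT_USD_PER_M
  );
}

// Example: a 50,000-token prompt with a 2,000-token reply.
console.log(estimateCostUsd(50_000, 2_000).toFixed(5)); // prints "0.00473"
```

At these rates a full million tokens in each direction comes to about $0.517, so output tokens dominate the bill only when replies are long relative to prompts.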

03 / 04

What's the context window for glm-4.7-flash?

glm-4.7-flash supports a context window of 200K tokens, with a maximum output of 131K tokens.
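A small sketch of how these two limits might be combined when choosing an output budget. It assumes that "200K" and "131K" mean 200,000 and 131,000 tokens and that prompt and output tokens share one context window. Neither assumption is confirmed on this page, and maxOutputFor is a hypothetical helper.

```typescript
// Limits quoted on this page. Treating "200K" and "131K" as round
// thousands is an assumption; the exact limits may differ slightly.
const CONTEXT_WINDOW = 200_000;
const MAX_OUTPUT = 131_000;

// Hypothetical helper: returns the largest output budget that is safe to
// request for a prompt of the given size, or 0 if the prompt does not fit.
// Assumes prompt and output tokens share the same context window.
function maxOutputFor(promptTokens: number): number {
  if (promptTokens < 0) {
    throw new RangeError('promptTokens must be non-negative');
  }
  const remaining = CONTEXT_WINDOW - promptTokens;
  if (remaining <= 0) return 0;
  return Math.min(remaining, MAX_OUTPUT);
}

console.log(maxOutputFor(10_000));  // prints 131000
console.log(maxOutputFor(150_000)); // prints 50000
console.log(maxOutputFor(250_000)); // prints 0
```

Under these assumptions, the output cap is the binding limit for short prompts, and the context window takes over once the prompt exceeds 69,000 tokens.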

04 / 04

Do I need a separate Anthropic or OpenAI account?

No. Synapse Garden is the single API surface — one key gives you OpenAI, Anthropic, Google, Meta, Mistral, DeepSeek, xAI, Cohere, and more. Billing, rate limits, and audit logs are unified.

READY

Try glm-4.7-flash in three minutes.

Sign up, create a key, drop our base URL into your existing client. The free tier includes a million tokens every month — no credit card.
