Image generation
Generate images from text prompts. Nano Banana, Flux, Recraft, Imagen — all via the AI SDK.
Image generation models on Synapse Garden split into two families with slightly different return shapes. The AI SDK normalizes both — pick the right function (multimodal generateText for the Nano Banana family, generateImage for image-only models) and the right model, and you're done.
Two families, two functions
| Function | Model family | Returns |
|---|---|---|
| `generateText` | Nano Banana (`google/gemini-3-pro-image`, `google/gemini-2.5-flash-image`, `openai/gpt-image-2`) | `result.files` (array of file objects with `uint8Array`) |
| `experimental_generateImage` | Image-only models (`bfl/flux-2-flex`, `recraft/recraft-v3`, `google/imagen-4.0-generate-001`) | `result.images` (array with `base64`) |
The Nano Banana family is multimodal LLM territory — text and images both flow through generateText. The image-only models live behind experimental_generateImage. Use whichever the upstream model supports.
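If a utility needs to accept results from either function, the two shapes can be normalized with a small helper. This is a sketch built on the return shapes in the table above; `extractBytes` and the two type aliases are hypothetical names, not part of the AI SDK:

```typescript
// Minimal shapes matching the two return families described above.
type NanoBananaResult = { files: { mediaType?: string; uint8Array: Uint8Array }[] }
type ImageOnlyResult = { images: { base64: string }[] }

// Hypothetical helper: return raw image bytes regardless of which family produced them.
function extractBytes(result: NanoBananaResult | ImageOnlyResult): Uint8Array[] {
  if ("files" in result) {
    // generateText results: keep only image files, drop any text parts
    return result.files
      .filter((f) => f.mediaType?.startsWith("image/"))
      .map((f) => f.uint8Array)
  }
  // experimental_generateImage results: decode base64 to bytes
  return result.images.map((img) => new Uint8Array(Buffer.from(img.base64, "base64")))
}
```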
Nano Banana family
```typescript
import { generateText } from "ai"
import fs from "node:fs"

const result = await generateText({
  model: "google/gemini-3-pro-image",
  baseURL: "https://synapse.garden/api/v1",
  apiKey: process.env.MG_KEY,
  prompt: "A serene mountain landscape at sunset, watercolor style.",
})

const image = result.files.find((f) => f.mediaType?.startsWith("image/"))
if (image) {
  const ext = image.mediaType?.split("/")[1] ?? "png"
  fs.writeFileSync(`output.${ext}`, image.uint8Array)
}
```

Nano Banana models can take text + reference images as input, generate text and/or images, and reference earlier images in a conversation. Useful for editing, variation, and "show me what this looks like" workflows.
```typescript
const result = await generateText({
  model: "google/gemini-3-pro-image",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Make this room cozier — add a fireplace and warm lighting." },
        { type: "image", image: fs.readFileSync("room.jpg") },
      ],
    },
  ],
})
```

Image-only models
```typescript
import { experimental_generateImage as generateImage } from "ai"
import fs from "node:fs"

const result = await generateImage({
  model: "bfl/flux-2-flex",
  prompt: "A vibrant coral reef with tropical fish, photorealistic.",
  aspectRatio: "16:9",
})

const buffer = Buffer.from(result.images[0].base64, "base64")
fs.writeFileSync("output.png", buffer)
```

Common parameters
```typescript
generateImage({
  model: "bfl/flux-2-flex",
  prompt: "...",
  aspectRatio: "16:9", // "1:1" | "4:3" | "16:9" | "21:9" | "3:4" | "9:16" | "9:21"
  size: "1024x1024", // alternative to aspectRatio
  n: 4, // generate 4 variations
  seed: 42, // deterministic — same seed + prompt = same image
})
```

Support for `n` and `seed` varies per model; `aspectRatio` is the most portable.
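When `n` is supported, each variation comes back as its own `result.images` element and decodes the same way as a single image. A small sketch; `saveVariations` is a hypothetical helper, and the `{ base64 }` element shape matches `result.images`:

```typescript
import fs from "node:fs"

// Hypothetical helper for the n > 1 case: write every returned variation
// to disk and return the paths, so nothing from the batch is dropped.
function saveVariations(images: { base64: string }[], prefix = "variation"): string[] {
  return images.map((img, i) => {
    const path = `${prefix}-${i}.png`
    fs.writeFileSync(path, Buffer.from(img.base64, "base64"))
    return path
  })
}

// Usage, after `const result = await generateImage({ ..., n: 4 })`:
//   saveVariations(result.images)
```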
Choosing a model
| Model | Style | Speed | Notes |
|---|---|---|---|
| `google/gemini-3-pro-image` (Nano Banana Pro) | Photorealistic, painterly, illustration | Medium | Multimodal — accepts reference images |
| `google/gemini-2.5-flash-image` (Nano Banana) | General purpose | Fast | Cheaper than Pro |
| `openai/gpt-image-2` | Stylized illustration | Medium | Strong typography rendering |
| `bfl/flux-2-flex` | Photorealistic | Fast | Best photo realism in the catalog |
| `bfl/flux-2-pro` | Photorealistic, ultra-detail | Slow | Higher cost; more detail |
| `recraft/recraft-v3` | Vector + raster, brand work | Medium | Strong at logos, illustrations |
| `google/imagen-4.0-generate-001` | Photorealistic | Medium | Google's flagship; strong on faces |
When in doubt, run the same prompt through bfl/flux-2-flex and google/gemini-3-pro-image and compare. They have very different aesthetic defaults.
Editing images
Some Nano Banana models accept an input image and edit it:
```typescript
const result = await generateText({
  model: "google/gemini-3-pro-image",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Remove the background. Replace with a soft gradient." },
        { type: "image", image: fs.readFileSync("portrait.jpg") },
      ],
    },
  ],
})
```

For OpenAI's image models with edit/inpaint support, the OpenAI SDK route is more direct:
```typescript
import OpenAI from "openai"
import fs from "node:fs"

const client = new OpenAI({
  baseURL: "https://synapse.garden/api/v1",
  apiKey: process.env.MG_KEY,
})

const file = await client.images.edit({
  model: "openai/gpt-image-2",
  image: fs.createReadStream("input.png"),
  mask: fs.createReadStream("mask.png"), // alpha channel = where to edit
  prompt: "Replace the sky with a sunset.",
  n: 1,
  size: "1024x1024",
})
```

Saving the result
The two return shapes:
| Shape | How to save |
|---|---|
| `result.files[i].uint8Array` (Nano Banana) | `fs.writeFileSync(path, image.uint8Array)` |
| `result.images[i].base64` (image-only) | `fs.writeFileSync(path, Buffer.from(image.base64, "base64"))` |
For uploading to a CDN, both shapes can be passed directly to the SDK of your storage provider (e.g. @vercel/blob's put() accepts a Buffer or Uint8Array).
```typescript
import { put } from "@vercel/blob"

const { url } = await put(`generated/${Date.now()}.png`, image.uint8Array, {
  access: "public",
  contentType: "image/png",
})
console.log(url)
```

Pricing
Image generation is billed per image for image-only models, and per request + per image for Nano Banana (which also charges for the text portion of the conversation). The model detail page on /models shows the live rate.
Rough ranges:
- Cheap (Nano Banana, Flux 2 Flex, Recraft) — $0.02–$0.08 per image
- Mid (Imagen, Flux Pro) — $0.04–$0.12 per image
- Premium (with editing, ultra-detail) — $0.10–$0.30 per image
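Those ranges are enough for back-of-envelope budgeting. The figures below simply restate the rough tiers above; they are not live rates, so check the model detail page on /models before relying on them:

```typescript
// Rough per-image USD ranges by tier, taken from the ballpark figures above.
// These are documentation estimates, not live rates.
const roughRateUsd = {
  cheap: [0.02, 0.08],   // Nano Banana, Flux 2 Flex, Recraft
  mid: [0.04, 0.12],     // Imagen, Flux Pro
  premium: [0.1, 0.3],   // editing, ultra-detail
} as const

// Hypothetical helper: estimate a [low, high] USD cost for a batch of images.
function estimateUsd(tier: keyof typeof roughRateUsd, imageCount: number): [number, number] {
  const [lo, hi] = roughRateUsd[tier]
  return [lo * imageCount, hi * imageCount]
}

// estimateUsd("cheap", 100) → roughly $2–$8 for the batch
```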
Caveats
- Resolution caps vary by model. Most cap at 2048×2048; ask for larger and you'll get the cap silently.
- Faces, brands, IP — provider safety policies may refuse certain prompts. The error code in the response makes it explicit.
- Aspect ratio support varies. `1:1`, `4:3`, and `16:9` are universal; obscure ratios fall back to the closest supported.
- Seeds aren't always honored. If you need reproducibility, log the response's `seed` field and re-use it.
- Safety overlays — most models add an invisible watermark for provenance. This doesn't affect normal use; it's detectable by C2PA tools.