API Reference

POST /v1/rerank

Cross-encoder reranking. Pass a query and candidates, get back relevance-ordered results with calibrated scores.

FIG. 00 · POST /v1/rerank: query × docs → scores

/v1/rerank is the second stage of high-quality retrieval. You give it a query and up to 1 000 documents; it returns each document with a calibrated relevance score (0–1), optionally truncated to the top_n highest-scoring results. The AI SDK's embed gives you the first stage; Synapse Garden gives you the second stage on the same key.

FIG. 01 · Rank by score (schematic)
The cross-encoder reads `(query, doc_i)` pairs and emits a calibrated score per pair. Documents are reordered by score and truncated to `top_n` (or the full list if omitted). One request = one billed query, regardless of document count.
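
The reorder-and-truncate step in the schematic can be sketched locally. This is only an illustration: the scoring itself happens in the cross-encoder on the server, the function name `rank_by_score` is hypothetical, and the third score below is a made-up value (the first two are borrowed from the response example on this page).

```python
from typing import Optional


def rank_by_score(scores: list[float], top_n: Optional[int] = None) -> list[dict]:
    """Order document indices by descending score, then keep the first top_n."""
    # sorted() is stable, so tied scores keep their original relative order.
    # How the service breaks ties is not documented; this is an assumption.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if top_n is not None:
        ranked = ranked[:top_n]
    return [{"index": i, "relevance_score": scores[i]} for i in ranked]


# One score per (query, doc_i) pair; 0.044 is an invented placeholder.
scores = [0.131, 0.927, 0.044]
top_two = rank_by_score(scores, top_n=2)
everything = rank_by_score(scores)  # top_n omitted: full list, still sorted
```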

Request

curl https://synapse.garden/api/v1/rerank \
  -H "Authorization: Bearer $MG_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "cohere/rerank-english-v3.0",
    "query": "How do I rotate an API key?",
    "documents": [
      "Spend caps return HTTP 402 when exceeded.",
      "API keys can be rotated under Keys → Rotate.",
      "Synapse Garden proxies 100+ language models."
    ],
    "top_n": 2
  }'
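
A rough Python equivalent of the curl call above, using only the standard library. The helper name `build_rerank_request` and the fallback key `mg_test_placeholder` are assumptions for illustration; the URL, headers, and body fields are taken from the example. The request is built but not sent, so the sketch runs without network access.

```python
import json
import os
import urllib.request

RERANK_URL = "https://synapse.garden/api/v1/rerank"


def build_rerank_request(api_key, model, query, documents, top_n=None):
    """Build (but do not send) a POST request for /v1/rerank."""
    body = {"model": model, "query": query, "documents": documents}
    if top_n is not None:
        body["top_n"] = top_n  # omit the field entirely to get all results
    return urllib.request.Request(
        RERANK_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_rerank_request(
    # Placeholder fallback so the sketch runs without MG_KEY set.
    api_key=os.environ.get("MG_KEY", "mg_test_placeholder"),
    model="cohere/rerank-english-v3.0",
    query="How do I rotate an API key?",
    documents=[
        "Spend caps return HTTP 402 when exceeded.",
        "API keys can be rotated under Keys → Rotate.",
        "Synapse Garden proxies 100+ language models.",
    ],
    top_n=2,
)
# To send it: with urllib.request.urlopen(request) as resp: payload = json.load(resp)
```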

Body schema

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| model | string | yes | provider/model-id, e.g. cohere/rerank-english-v3.0 |
| query | string | yes | The search intent. Max 8 000 chars. |
| documents | string[] | yes | 1–1 000 candidate strings. Results come back ordered by score; results[].index gives each document's position in this array. |
| top_n | integer | no | Truncate to top n results. Defaults to all. Range 1–1 000. |
| user | string | no | Caller-defined identifier (passes through). Max 256 chars. |
| providerOptions | object | no | Provider-namespaced overrides (e.g. { cohere: { return_documents: true } }). |
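
The constraints in the schema can be checked client-side before spending a request. The sketch below is a hypothetical pre-flight check, not the server's validation: the name `validate_rerank_body` and the message strings are invented, and it assumes the character limits correspond to Python's `len()` on a `str`, which the server may count differently.

```python
MAX_QUERY_CHARS = 8000
MAX_DOCUMENTS = 1000
MAX_USER_CHARS = 256


def validate_rerank_body(body: dict) -> list[str]:
    """Return a list of problems; an empty list means the body looks valid."""
    problems = []

    model = body.get("model")
    if not isinstance(model, str) or not model:
        problems.append("model must be a non-empty string")

    query = body.get("query")
    if not isinstance(query, str):
        problems.append("query must be a string")
    elif len(query) > MAX_QUERY_CHARS:
        problems.append("query exceeds 8000 chars")

    documents = body.get("documents")
    if not isinstance(documents, list) or not all(isinstance(d, str) for d in documents):
        problems.append("documents must be a list of strings")
    elif not 1 <= len(documents) <= MAX_DOCUMENTS:
        problems.append("documents must contain 1 to 1000 items")

    top_n = body.get("top_n")
    if top_n is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(top_n, bool) or not isinstance(top_n, int) or not 1 <= top_n <= MAX_DOCUMENTS:
            problems.append("top_n must be an integer from 1 to 1000")

    user = body.get("user")
    if user is not None and (not isinstance(user, str) or len(user) > MAX_USER_CHARS):
        problems.append("user must be a string of at most 256 chars")

    return problems


ok = validate_rerank_body({"model": "cohere/rerank-english-v3.0", "query": "q", "documents": ["a"]})
bad = validate_rerank_body({"model": "cohere/rerank-english-v3.0", "query": "q", "documents": [], "top_n": 0})
```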

Headers

| Header | Required | Notes |
| --- | --- | --- |
| Authorization: Bearer mg_live_* | yes | Production key. Sandbox keys use mg_test_*. |
| x-mg-idempotency-key | no | ULID/UUID. Replays for 24 h. |
| x-mg-trace-id | no | Surfaces in OTEL spans + logs for correlation. |
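
A minimal sketch of assembling these headers in Python. The helper `build_headers` is hypothetical; it generates a UUID when no idempotency key is supplied, which is one reasonable choice given that the table accepts a ULID or UUID.

```python
import uuid


def build_headers(api_key, idempotency_key=None, trace_id=None):
    """Assemble request headers; the two x-mg-* headers are optional."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Generate a fresh UUID unless the caller passes a key to reuse.
        "x-mg-idempotency-key": idempotency_key or str(uuid.uuid4()),
    }
    if trace_id:
        headers["x-mg-trace-id"] = trace_id
    return headers


first_try = build_headers("mg_test_example")  # placeholder sandbox key
# A retry of the same logical request reuses the same idempotency key.
retry = build_headers("mg_test_example", idempotency_key=first_try["x-mg-idempotency-key"])
```

Reusing the key on a retry assumes that "Replays for 24 h" means the service returns the stored result for a repeated key within that window; the table does not spell this out.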

Response

{
  "id": "rrk_01J9Z...",
  "model": "cohere/rerank-english-v3.0",
  "results": [
    { "index": 1, "relevance_score": 0.927, "document": "API keys can be rotated under Keys → Rotate." },
    { "index": 0, "relevance_score": 0.131, "document": "Spend caps return HTTP 402 when exceeded." }
  ],
  "usage": { "search_units": 1 }
}

| Field | Type | Notes |
| --- | --- | --- |
| id | string | Server-assigned, useful for support tickets. |
| model | string | Echoes the requested model. |
| results[].index | integer | Position of this document in the original documents array. |
| results[].relevance_score | number | Calibrated probability (0–1). |
| results[].document | string | Echoes the document text (omitted if you set providerOptions.cohere.return_documents: false). |
| usage.search_units | integer | Always 1 per request; reranking is billed per query, not per token or per document. |
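
Because results[].index points back into the documents array you sent, the text can be recovered client-side even when the document field is omitted. A small sketch, with `attach_documents` as a hypothetical helper name and the response trimmed to the fields it needs:

```python
def attach_documents(documents, response):
    """Pair each result with its original text via results[].index."""
    return [
        (documents[result["index"]], result["relevance_score"])
        for result in response["results"]
    ]


documents = [
    "Spend caps return HTTP 402 when exceeded.",
    "API keys can be rotated under Keys → Rotate.",
    "Synapse Garden proxies 100+ language models.",
]
# Shape taken from the response example above, without the document field.
response = {
    "results": [
        {"index": 1, "relevance_score": 0.927},
        {"index": 0, "relevance_score": 0.131},
    ]
}
ranked = attach_documents(documents, response)
```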

Errors

Same envelope as the rest of /v1/*:

{ "error": { "code": "BAD_REQUEST", "message": "documents must contain at least 1 item", "type": "invalid_request_error" } }

| Status | error.code | When |
| --- | --- | --- |
| 400 | BAD_REQUEST | Body fails Zod validation. |
| 401 | UNAUTHORIZED | Missing/invalid Authorization header. |
| 402 | BUDGET_EXCEEDED | Project spend cap reached. |
| 403 | MODEL_NOT_ALLOWED | Model is not on this project's allowlist. |
| 429 | RATE_LIMITED | Per-key RPM/TPM exceeded. |
| 5xx | UPSTREAM_ERROR | Provider failed; retry with backoff. |
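
One way to act on the table is exponential backoff with jitter. The table only says to retry 5xx; retrying 429 as well is an assumption, as are the names `is_retryable` and `call_with_backoff`, the attempt count, and the delays. `send` is a caller-supplied function returning `(status, payload)`, so the sketch makes no network calls itself.

```python
import random
import time


def is_retryable(status: int) -> bool:
    """5xx per the table; 429 is included here as an assumption."""
    return status == 429 or 500 <= status <= 599


def call_with_backoff(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    max_attempts must be at least 1. Returns the last (status, payload).
    """
    for attempt in range(max_attempts):
        status, payload = send()
        if not is_retryable(status) or attempt == max_attempts - 1:
            return status, payload
        # Exponential backoff (0.5 s, 1 s, 2 s, ...) plus random jitter.
        sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))


# Simulated upstream: two 503s, then success. Delays are recorded, not slept.
responses = iter([(503, None), (503, None), (200, {"results": []})])
delays = []
status, payload = call_with_backoff(lambda: next(responses), sleep=delays.append)
```

The 400, 401, 402, and 403 cases are returned immediately in this sketch, since each describes a condition a repeat of the same request would not change.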

Limits

  • 1 000 documents per request — chunk above that.
  • 8 000-char query.
  • Documents longer than the model's per-doc input window are truncated upstream; chunk first if precision matters.
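
Chunking above the 1 000-document limit could look like the sketch below. Both helpers are hypothetical. Each chunk is a separate request and therefore a separate billed query, and merging by score assumes that scores from different requests are directly comparable; the calibrated scores suggest this, but it is not stated outright.

```python
def chunk_documents(documents, size=1000):
    """Split documents into (offset, chunk) pairs of at most `size` items."""
    return [
        (start, documents[start:start + size])
        for start in range(0, len(documents), size)
    ]


def merge_results(batches, top_n=None):
    """Merge per-chunk results into one list ranked by relevance_score.

    batches: list of (offset, results), where results is the `results`
    array returned for the chunk that started at `offset`.
    """
    merged = [
        {**result, "index": result["index"] + offset}  # back to global index
        for offset, results in batches
        for result in results
    ]
    merged.sort(key=lambda r: r["relevance_score"], reverse=True)
    return merged if top_n is None else merged[:top_n]


docs = [f"doc {i}" for i in range(2500)]
chunks = chunk_documents(docs)
# Made-up per-chunk results, standing in for three API responses.
merged = merge_results(
    [
        (0, [{"index": 3, "relevance_score": 0.40}]),
        (1000, [{"index": 7, "relevance_score": 0.90}]),
        (2000, [{"index": 0, "relevance_score": 0.65}]),
    ],
    top_n=2,
)
```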