Reranker API

Improve retrieval quality by scoring and reordering candidate documents based on their relevance to a query.

Use reranking as a final step after initial search to surface the best context for RAG and question answering.


POST /v1/rerank

Rerank documents

Rerank a list of candidate documents by relevance to a query. Use reranking to improve search quality in RAG pipelines by sorting results based on semantic relevance.

Required attributes

  • model (string)
    The reranker model (or deployment ID/alias) to use for this request.

  • query (string)
    The search query used to score relevance.

  • documents (array)
    The candidate documents to rerank. Each item can be:

    • a string (document text), or
    • an object with a text field and optional fields such as id, title, or metadata.
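
Because documents accepts both plain strings and objects, a client may find it convenient to convert everything to the object form before sending. The sketch below is one way to do that in Python; normalize_documents is a hypothetical helper, not part of the API, and the validation rule (every object must carry a string text field) is an assumption based on the description above.

```python
from typing import Any, Dict, List, Union

Document = Union[str, Dict[str, Any]]


def normalize_documents(docs: List[Document]) -> List[Dict[str, Any]]:
    """Convert a mixed list of strings and objects into objects with a "text" field."""
    normalized = []
    for position, doc in enumerate(docs):
        if isinstance(doc, str):
            # A bare string is treated as the document text.
            normalized.append({"text": doc})
        elif isinstance(doc, dict) and isinstance(doc.get("text"), str):
            # Shallow copy so the caller's dict is left untouched.
            normalized.append(dict(doc))
        else:
            raise ValueError(
                f"documents[{position}] must be a string or an object with a 'text' field"
            )
    return normalized


docs = normalize_documents([
    "Deployments let you run models behind an API endpoint...",
    {"id": "doc_2", "text": "Billing lets you add balance and view usage..."},
])
```

Normalizing is optional, since the endpoint accepts either form. It mainly helps when later code expects every candidate to have the same shape.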

Optional attributes

  • top_n (integer)
    Number of top results to return. If omitted, all documents are returned with scores.

  • return_documents (boolean)
    If true, include the original document payload in each result.

  • truncate (string)
    How to handle long documents. Common values: "none", "start", "end". If omitted, the server default is used.

  • max_tokens_per_document (integer)
    Maximum tokens (or approximate token budget) to consider per document when scoring. If omitted, the server default is used.

  • metadata (object)
    Developer-defined metadata to attach to the request (key/value pairs).

  • user (string)
    A unique identifier representing your end user (can help with abuse monitoring and analytics). If unsupported, it may be ignored.

  • extra_body (object)
    Additional provider-specific parameters to pass through without changing the request shape. If present, the platform merges this object into the request body sent to the reranker backend.
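
Since several optional attributes fall back to a server default when omitted, a client can leave unset fields out of the request body rather than sending null values. The following is a minimal sketch of that idea; build_rerank_payload is a hypothetical helper, and it covers only a subset of the optional attributes listed above.

```python
from typing import Any, Dict, List, Optional


def build_rerank_payload(
    model: str,
    query: str,
    documents: List[Any],
    top_n: Optional[int] = None,
    return_documents: Optional[bool] = None,
    truncate: Optional[str] = None,
    max_tokens_per_document: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a request body, leaving out optional fields that were not set."""
    payload: Dict[str, Any] = {"model": model, "query": query, "documents": documents}
    optional = {
        "top_n": top_n,
        "return_documents": return_documents,
        "truncate": truncate,
        "max_tokens_per_document": max_tokens_per_document,
    }
    # Only send fields the caller set; anything omitted is left to the server.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


payload = build_rerank_payload(
    model="your-reranker-model-or-deployment-id",
    query="How do I deploy a model?",
    documents=["Deployments let you run models behind an API endpoint..."],
    top_n=1,
    truncate="end",
)
```

Here payload contains top_n and truncate but no return_documents or max_tokens_per_document key, so those two are decided by the server.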

Request

POST /v1/rerank
curl "$BASE_URL/v1/rerank" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-reranker-model-or-deployment-id",
    "query": "How do I deploy a model?",
    "documents": [
      { "id": "doc_1", "text": "Deployments let you run models behind an API endpoint..." },
      { "id": "doc_2", "text": "Billing lets you add balance and view usage..." },
      { "id": "doc_3", "text": "A GPU runtime is used to run training jobs..." }
    ],
    "top_n": 2,
    "return_documents": true
  }'
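
For callers who prefer Python over curl, the same request can be sketched with the standard library alone. build_rerank_request is a hypothetical helper, https://api.example.com is a placeholder host, and the BASE_URL and API_KEY environment variables mirror the shell variables used in the curl example.

```python
import json
import os
import urllib.request


def build_rerank_request(base_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST /v1/rerank request."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_rerank_request(
    base_url=os.environ.get("BASE_URL", "https://api.example.com"),  # placeholder host
    api_key=os.environ.get("API_KEY", "test-key"),  # placeholder key
    payload={
        "model": "your-reranker-model-or-deployment-id",
        "query": "How do I deploy a model?",
        "documents": [
            {"id": "doc_1", "text": "Deployments let you run models behind an API endpoint..."},
            {"id": "doc_2", "text": "Billing lets you add balance and view usage..."},
            {"id": "doc_3", "text": "A GPU runtime is used to run training jobs..."},
        ],
        "top_n": 2,
        "return_documents": True,
    },
)

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(request, timeout=30) as response:
#     body = json.load(response)
```

Keeping request construction separate from sending makes the body and headers easy to inspect or unit-test before any network call is made.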

Response

{
  "id": "rerank_01ABCDEF234567890",
  "object": "list",
  "model": "your-reranker-model-or-deployment-id",
  "results": [
    {
      "index": 0,
      "relevance_score": 0.92,
      "document": { "id": "doc_1", "text": "Deployments let you run models behind an API endpoint..." }
    },
    {
      "index": 2,
      "relevance_score": 0.41,
      "document": { "id": "doc_3", "text": "A GPU runtime is used to run training jobs..." }
    }
  ],
  "usage": {
    "prompt_tokens": 34,
    "total_tokens": 34
  }
}
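
One way to consume this response is to map each result back to the candidate it scores. The sketch below assumes that index is the zero-based position of the document in the request's documents array, which matches the sample (index 2 corresponds to doc_3) but is not stated explicitly above. ranked_documents is a hypothetical helper, and the explicit sort is a defensive choice in case results are not already ordered by score.

```python
import json

# The candidates sent in the request, in their original order.
candidates = [
    {"id": "doc_1", "text": "Deployments let you run models behind an API endpoint..."},
    {"id": "doc_2", "text": "Billing lets you add balance and view usage..."},
    {"id": "doc_3", "text": "A GPU runtime is used to run training jobs..."},
]

# A trimmed copy of the sample response shown above.
response_body = json.loads("""
{
  "results": [
    {"index": 0, "relevance_score": 0.92},
    {"index": 2, "relevance_score": 0.41}
  ]
}
""")


def ranked_documents(candidates, response_body):
    """Pair each result with its original candidate, highest score first."""
    results = sorted(
        response_body["results"],
        key=lambda result: result["relevance_score"],
        reverse=True,
    )
    # Assumption: "index" is a zero-based position in the request's documents array.
    return [(candidates[result["index"]], result["relevance_score"]) for result in results]


ranked = ranked_documents(candidates, response_body)
for document, score in ranked:
    print(f"{score:.2f}  {document['id']}")
```

Looking candidates up by index works even when return_documents is false, because it relies only on the order of the documents sent in the request.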
