Embeddings API
Turn text into dense vectors you can use for semantic search, clustering, and retrieval-augmented generation (RAG).
Create embeddings for single inputs or batches and plug the results directly into your vector database or search stack.
Create embeddings
Create vector embeddings for text (or token arrays) using an OpenAI-compatible Embeddings endpoint. Use embeddings for semantic search, RAG, clustering, classification, and deduplication.
Required attributes
- model (string)
  The embedding model (or deployment ID/alias) to use for this request.
- input (string | array)
  The input text to embed. You can provide:
  - a single string (one embedding),
  - an array of strings (one embedding per string),
  - or an array of token ID arrays (e.g. [[1,2,3], [4,5,6]]) if your client uses token inputs.
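As a sketch, the three accepted input shapes produce request bodies like the following (the model name is a placeholder; substitute your own model or deployment ID):

```python
# The three accepted shapes for the "input" field, as JSON-serializable payloads.
# Model name is a placeholder; substitute your own deployment ID.

single = {
    "model": "your-embedding-model-or-deployment-id",
    "input": "The quick brown fox jumps over the lazy dog.",  # one embedding back
}

batch = {
    "model": "your-embedding-model-or-deployment-id",
    "input": ["first document", "second document"],  # one embedding per string
}

tokens = {
    "model": "your-embedding-model-or-deployment-id",
    "input": [[1, 2, 3], [4, 5, 6]],  # pre-tokenized inputs, if your client tokenizes
}
```

For batch input, each embedding in the response carries an `index` matching the position of its string in the `input` array.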
Optional attributes
- encoding_format (string)
  The format to return embeddings in. Common values are "float" (default) and "base64".
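With `"base64"`, each embedding arrives as a base64 string rather than a JSON array of floats, which shrinks the response payload. A minimal decoding sketch, assuming the backend packs little-endian float32 values as OpenAI's API does (other backends may differ):

```python
import base64
import struct

def decode_embedding(b64: str) -> list[float]:
    """Decode a base64-encoded embedding into a list of floats.

    Assumes little-endian float32 packing, as used by OpenAI's API
    for encoding_format="base64"; verify this against your backend.
    """
    raw = base64.b64decode(b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))
```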
- dimensions (integer)
  The number of dimensions for the output embeddings, for models that support dimension reduction.
- user (string)
  A unique identifier representing your end-user, which can help with abuse monitoring and analytics. If unsupported, it may be ignored.
- metadata (object)
  Developer-defined metadata to attach to the request (key/value pairs).
- extra_body (object)
  Optional pass-through: additional provider-specific parameters to forward without changing the OpenAI-compatible payload. If present, the platform merges this object into the request body sent to the underlying embedding backend.
Request
```bash
curl "$BASE_URL/v1/embeddings" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-embedding-model-or-deployment-id",
    "input": "The quick brown fox jumps over the lazy dog.",
    "encoding_format": "float"
  }'
```
Response
```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0123, -0.0456, 0.0789]
    }
  ],
  "model": "your-embedding-model-or-deployment-id",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9
  }
}
```
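Once you have embeddings, semantic search reduces to comparing vectors, and cosine similarity is the usual measure. A minimal sketch with no external dependencies (the document IDs and vector values below are illustrative):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Rank candidate embeddings against a query embedding (illustrative values).
query = [0.0123, -0.0456, 0.0789]
candidates = {
    "doc-a": [0.0120, -0.0450, 0.0800],  # nearly parallel to the query
    "doc-b": [-0.9, 0.1, 0.2],           # points elsewhere
}
best = max(candidates, key=lambda doc: cosine_similarity(query, candidates[doc]))
```

In production you would typically store the vectors in a vector database and let it perform this ranking with an approximate nearest-neighbor index.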