Rerank

curl --request POST \
  --url https://api.zeroentropy.dev/v1/models/rerank \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "<string>",
  "query": "<string>",
  "top_n": 123,
  "documents": [
    "<string>"
  ],
  "latency": "fast"
}'

{
  "results": [
    {
      "index": 123,
      "relevance_score": 123
    }
  ]
}

Reranks the provided documents, according to the provided query.

Results are returned in descending order of relevance. Each result contains an index and a relevance_score: the index is the document's position in the documents array passed in the request, and the score is the query-document relevance determined by the reranker model.
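Because each result carries only an index and a score, the caller has to look the document text up again. A minimal sketch of that lookup follows; the helper name `attach_documents` and the sample scores are made up for illustration and are not part of the API.

```python
from typing import Dict, List, Tuple


def attach_documents(documents: List[str], results: List[Dict]) -> List[Tuple[str, float]]:
    """Pair each result with the document its index refers to."""
    return [(documents[r["index"]], r["relevance_score"]) for r in results]


documents = [
    "Paris is the capital of France.",
    "Bananas are yellow.",
    "France borders Spain.",
]

# Hypothetical response body, shaped like the example response above.
response = {
    "results": [
        {"index": 0, "relevance_score": 0.93},
        {"index": 2, "relevance_score": 0.41},
        {"index": 1, "relevance_score": 0.02},
    ]
}

ranked = attach_documents(documents, response["results"])
for text, score in ranked:
    print(f"{score:.2f}  {text}")
```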

By default, each organization has a rate limit of 2,500,000 bytes per minute. Requests beyond that limit are throttled into latency: "slow" mode, up to 10,000,000 bytes per minute. Requests beyond that second limit fail with a 429 error. To request higher rate limits, contact founders@zeroentropy.dev or message us on Discord or Slack.
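The two thresholds can be sketched as a small client-side estimate. This is only an approximation under stated assumptions: the page does not say how bytes are counted or whether the limits are inclusive, so the UTF-8 size of the query plus documents and the `<=` comparisons below are guesses, and both function names are hypothetical.

```python
from typing import List

FAST_LIMIT_BYTES_PER_MIN = 2_500_000   # default limit stated above
SLOW_LIMIT_BYTES_PER_MIN = 10_000_000  # ceiling for throttled "slow" mode


def estimate_request_bytes(query: str, documents: List[str]) -> int:
    """Assumption: size is the UTF-8 length of the query plus all documents."""
    return len(query.encode("utf-8")) + sum(len(d.encode("utf-8")) for d in documents)


def expected_handling(bytes_this_minute: int) -> str:
    """Return the handling the stated limits suggest for a given usage level."""
    if bytes_this_minute <= FAST_LIMIT_BYTES_PER_MIN:
        return "fast"
    if bytes_this_minute <= SLOW_LIMIT_BYTES_PER_MIN:
        return "slow"
    return "429"


print(expected_handling(1_000_000))
print(expected_handling(5_000_000))
print(expected_handling(20_000_000))
```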

POST /v1/models/rerank

Authorizations

Authorization

string

header

required

The Authorization header must be provided in the format Bearer <your-api-key>.

You can get your API Key at the Dashboard!
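As an illustration of the header format, here is a sketch that builds the same request as the curl example using only the Python standard library. The helper name `build_request` and the placeholder key are assumptions; the request is constructed but not sent.

```python
import json
import urllib.request

RERANK_URL = "https://api.zeroentropy.dev/v1/models/rerank"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request carrying the Bearer token and a JSON body."""
    return urllib.request.Request(
        RERANK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "example-key" is a placeholder; use the API key from your dashboard.
req = build_request(
    "example-key",
    {"model": "zerank-2", "query": "capital of France", "documents": ["Paris", "Berlin"]},
)
# To send it: urllib.request.urlopen(req), then json.load() the response.
```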

Body

application/json

model

string

required

The model ID to use for reranking. Options are: ["zerank-2", "zerank-1", "zerank-1-small"]

query

string

required

The query to rerank the documents by.

documents

string[]

required

The list of documents to rerank. Each document is a string.

top_n

integer | null

If provided, only the top n documents are returned in the results array. Otherwise, n defaults to the length of the documents array.
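The fields described so far can be assembled with a small helper. This is a sketch, not an official client: `build_payload` is a hypothetical name, the model check simply mirrors the options listed above, and leaving out `top_n` when it is unset is one choice among several (the type `integer | null` suggests sending null would also be accepted).

```python
from typing import List, Optional

# Model IDs as listed in the "model" field description.
KNOWN_MODELS = ("zerank-2", "zerank-1", "zerank-1-small")


def build_payload(
    model: str,
    query: str,
    documents: List[str],
    top_n: Optional[int] = None,
) -> dict:
    """Assemble the JSON body, including top_n only when the caller sets it."""
    if model not in KNOWN_MODELS:
        raise ValueError(f"unknown model {model!r}; expected one of {KNOWN_MODELS}")
    payload = {"model": model, "query": query, "documents": list(documents)}
    if top_n is not None:
        payload["top_n"] = top_n
    return payload


all_docs = build_payload("zerank-2", "capital of France", ["Paris", "Berlin"])
top_one = build_payload("zerank-2", "capital of France", ["Paris", "Berlin"], top_n=1)
```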

latency

enum<string> | null

Whether the call runs as "fast" or "slow" inference. Rate limits for slow calls are orders of magnitude higher, but expect latency above 10 seconds. Fast calls are guaranteed subsecond latency, but their rate limits are lower. If not specified, a "fast" call is attempted first; if you have exceeded your fast rate limit, a slow call is executed instead. If explicitly set to "fast", a 429 is returned when the call cannot be executed fast.

Available options: fast, slow
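Omitting latency already gives a fast-then-slow fallback on the server side. A caller who wants to know which mode actually served the request could do the fallback explicitly, as sketched here. The `send` callable is a hypothetical transport that returns `(status_code, parsed_body)`, and treating any 2xx status as success is an assumption.

```python
from typing import Callable, Tuple

Send = Callable[[dict], Tuple[int, dict]]


def rerank_fast_then_slow(send: Send, payload: dict) -> Tuple[str, dict]:
    """Try latency="fast"; on a 429, retry once with latency="slow"."""
    mode = "fast"
    status, body = send({**payload, "latency": "fast"})
    if status == 429:
        mode = "slow"
        status, body = send({**payload, "latency": "slow"})
    if not 200 <= status < 300:
        raise RuntimeError(f"rerank failed with status {status}")
    return mode, body


# Stand-in transport for illustration: rejects fast calls, accepts slow ones.
attempts = []


def fake_send(payload: dict) -> Tuple[int, dict]:
    attempts.append(payload["latency"])
    if payload["latency"] == "fast":
        return 429, {}
    return 200, {"results": []}


mode, body = rerank_fast_then_slow(
    fake_send, {"model": "zerank-2", "query": "q", "documents": ["d"]}
)
```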

Response

Successful Response

results

RerankResult · object[]

required

The results, in descending order of relevance to the query.
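A response body can be parsed into typed records as sketched here. The two field names come from the example response; the class layout, the helper names, and the sample scores are illustrative assumptions.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class RerankResult:
    index: int              # position in the documents array of the request
    relevance_score: float  # query-document relevance from the reranker


def parse_results(body: str) -> List[RerankResult]:
    """Parse a JSON response body into RerankResult records."""
    data = json.loads(body)
    return [
        RerankResult(index=r["index"], relevance_score=r["relevance_score"])
        for r in data["results"]
    ]


def is_descending(results: List[RerankResult]) -> bool:
    """Check that scores never increase from one result to the next."""
    return all(a.relevance_score >= b.relevance_score for a, b in zip(results, results[1:]))


# Hypothetical response body for illustration.
sample = '{"results": [{"index": 1, "relevance_score": 0.8}, {"index": 0, "relevance_score": 0.3}]}'
parsed = parse_results(sample)
```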
