> ## Documentation Index
> Fetch the complete documentation index at: https://docs.nugen.in/llms.txt
> Use this file to discover all available pages before exploring further.

# Rerank Documents

> Rerank documents based on semantic relevance to a query.


This endpoint computes semantic relevance scores for a list of documents against a query, returning them in descending order of relevance. Supports multiple reranking methods from fast bi-encoders to advanced hybrid approaches. Ideal for improving search results and retrieval-augmented generation (RAG) pipelines.


**Request Body:**

- `model`: Reranker model ID (required) - e.g., `aligned-model-01kmqm4nrn9fw6r`
- `query`: Search query or question text (required)
- `documents`: Array of document texts or objects to rerank (required) - Can be strings or dictionaries



**Returns:**

- `results`: Array of reranked documents with relevance scores, each containing:
  - `index`: Original position in input array
  - `relevance_score`: Computed relevance score (higher = more relevant)
  - `document` (optional): Document text or object (if requested)
- `meta`: Operation metadata containing:
  - `method`: Reranking method used
  - `deployed_model_id`: Model ID that was used
  - `model_cross` (optional): Cross-encoder model used (if applicable)
  - `stage1_top_k` (optional): Stage 1 top-k for hybrid methods


**Example Request (Fast Method):**

```json
POST /api/v3/inference/reranker
Headers: {"Authorization": "Bearer <api_key>"}

{
  "model": "aligned-model-01kmqm4nrn9fw6r",
  "query": "What is machine learning?",
  "documents": [
    "Machine learning is a subset of artificial intelligence that enables computers to learn from data.",
    "Python is a popular programming language for web development.",
    "Deep learning uses neural networks with multiple layers.",
    "The weather today is sunny and warm.",
    "Supervised learning requires labeled training data."
  ],
  "method": "fast",
  "top_n": 3
}
```


**Example Response:**

```json
{
  "results": [
    {
      "index": 0,
      "relevance_score": 0.95,
      "document": "Machine learning is a subset of artificial intelligence that enables computers to learn from data."
    },
    {
      "index": 4,
      "relevance_score": 0.82,
      "document": "Supervised learning requires labeled training data."
    },
    {
      "index": 2,
      "relevance_score": 0.76,
      "document": "Deep learning uses neural networks with multiple layers."
    }
  ],
  "meta": {
    "method": "fast",
    "deployed_model_id": "bge-reranker-v2-m3"
  }
}
```


**Example Request (Hybrid Method):**

```json
POST /api/v3/inference/reranker
Headers: {"Authorization": "Bearer <api_key>"}

{
  "model": "aligned-model-01kmqm4nrn9fw6r",
  "query": "How do neural networks work?",
  "documents": [
    "Neural networks are computing systems inspired by biological neural networks.",
    "The stock market is volatile today.",
    "Backpropagation is used to train neural networks."
  ],
}
```


**Example Response (Hybrid Method):**

```json
{
  "results": [
    {
      "index": 0,
      "relevance_score": 0.92,
      "document": "Neural networks are computing systems inspired by biological neural networks."
    },
    {
      "index": 2,
      "relevance_score": 0.88,
      "document": "Backpropagation is used to train neural networks."
    }
  ],
  "meta": {
    "deployed_model_id": "aligned-model-01kmqm4nrn9fw6r",
  }
}
```


**Use Cases:**

- **Search Enhancement**: Improve search result ranking by semantic relevance
- **RAG Pipelines**: Select most relevant context chunks for LLM prompts
- **Semantic Filtering**: Find documents most related to a specific topic
- **Question Answering**: Rank candidate answers by relevance to the question

**Notes:**

- Documents can be plain strings or structured objects (dictionaries)
- Documents are returned in descending order of relevance (most relevant first)
- Results are always sorted by relevance_score in descending order (highest first)
- Relevance scores typically range from 0.0 to 1.0



## OpenAPI

````yaml https://api.nugen.in/openapi-public.json post /api/v3/inference/reranker
openapi: 3.1.0
info:
  title: Nugen Intelligence API
  description: 'Nugen Intelligence : Powering Specialized Intelligence At Scale'
  version: 25.4.20
servers: []
security: []
paths:
  /api/v3/inference/reranker:
    post:
      tags:
        - Inference
      summary: Rerank Documents
      description: >-
        Rerank documents based on semantic relevance to a query.



        This endpoint computes semantic relevance scores for a list of documents
        against a query, returning them in descending order of relevance.
        Supports multiple reranking methods from fast bi-encoders to advanced
        hybrid approaches. Ideal for improving search results and
        retrieval-augmented generation (RAG) pipelines.



        **Request Body:**


        - `model`: Reranker model ID (required) - e.g.,
        `aligned-model-01kmqm4nrn9fw6r`

        - `query`: Search query or question text (required)

        - `documents`: Array of document texts or objects to rerank (required) -
        Can be strings or dictionaries




        **Returns:**


        - `results`: Array of reranked documents with relevance scores, each
        containing:
          - `index`: Original position in input array
          - `relevance_score`: Computed relevance score (higher = more relevant)
          - `document` (optional): Document text or object (if requested)
        - `meta`: Operation metadata containing:
          - `method`: Reranking method used
          - `deployed_model_id`: Model ID that was used
          - `model_cross` (optional): Cross-encoder model used (if applicable)
          - `stage1_top_k` (optional): Stage 1 top-k for hybrid methods


        **Example Request (Fast Method):**


        ```json

        POST /api/v3/inference/reranker

        Headers: {"Authorization": "Bearer <api_key>"}


        {
          "model": "aligned-model-01kmqm4nrn9fw6r",
          "query": "What is machine learning?",
          "documents": [
            "Machine learning is a subset of artificial intelligence that enables computers to learn from data.",
            "Python is a popular programming language for web development.",
            "Deep learning uses neural networks with multiple layers.",
            "The weather today is sunny and warm.",
            "Supervised learning requires labeled training data."
          ],
          "method": "fast",
          "top_n": 3
        }

        ```



        **Example Response:**


        ```json

        {
          "results": [
            {
              "index": 0,
              "relevance_score": 0.95,
              "document": "Machine learning is a subset of artificial intelligence that enables computers to learn from data."
            },
            {
              "index": 4,
              "relevance_score": 0.82,
              "document": "Supervised learning requires labeled training data."
            },
            {
              "index": 2,
              "relevance_score": 0.76,
              "document": "Deep learning uses neural networks with multiple layers."
            }
          ],
          "meta": {
            "method": "fast",
            "deployed_model_id": "bge-reranker-v2-m3"
          }
        }

        ```



        **Example Request (Hybrid Method):**


        ```json

        POST /api/v3/inference/reranker

        Headers: {"Authorization": "Bearer <api_key>"}


        {
          "model": "aligned-model-01kmqm4nrn9fw6r",
          "query": "How do neural networks work?",
          "documents": [
            "Neural networks are computing systems inspired by biological neural networks.",
            "The stock market is volatile today.",
            "Backpropagation is used to train neural networks."
          ],
        }

        ```



        **Example Response (Hybrid Method):**


        ```json

        {
          "results": [
            {
              "index": 0,
              "relevance_score": 0.92,
              "document": "Neural networks are computing systems inspired by biological neural networks."
            },
            {
              "index": 2,
              "relevance_score": 0.88,
              "document": "Backpropagation is used to train neural networks."
            }
          ],
          "meta": {
            "deployed_model_id": "aligned-model-01kmqm4nrn9fw6r",
          }
        }

        ```



        **Use Cases:**


        - **Search Enhancement**: Improve search result ranking by semantic
        relevance

        - **RAG Pipelines**: Select most relevant context chunks for LLM prompts

        - **Semantic Filtering**: Find documents most related to a specific
        topic

        - **Question Answering**: Rank candidate answers by relevance to the
        question


        **Notes:**


        - Documents can be plain strings or structured objects (dictionaries)

        - Documents are returned in descending order of relevance (most relevant
        first)

        - Results are always sorted by relevance_score in descending order
        (highest first)

        - Relevance scores typically range from 0.0 to 1.0
      operationId: reranker
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/RerankRequestSchema'
        required: true
      responses:
        '200':
          description: Return the re-ranked chunks based on similarity
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/RerankResponseSchema'
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
      security:
        - HTTPBearer: []
components:
  schemas:
    RerankRequestSchema:
      properties:
        model:
          type: string
          title: Model
          description: Model ID to use for reranking
          example: aligned-model-01kmqm4nrn9fw6r
        query:
          type: string
          title: Query
          description: Search query
          example:
            - What is Python?
        documents:
          items:
            anyOf:
              - type: string
              - additionalProperties: true
                type: object
          type: array
          title: Documents
          description: Documents to rank
          example:
            - - Python is a programming language
              - Java is a language
        method:
          type: string
          enum:
            - fast
            - standard
            - optimal
            - best
          title: Method
          description: >-
            Rerank method: fast (bi-encoder), standard (cross-encoder), optimal
            (hybrid bi+cross), best (subspace+attention)
          default: fast
        model_cross:
          anyOf:
            - type: string
            - type: 'null'
          title: Model Cross
          description: Cross-encoder model (required for standard/optimal methods)
        top_n:
          anyOf:
            - type: integer
              minimum: 1
            - type: 'null'
          title: Top N
          description: Return top N results
        stage1_top_k:
          type: integer
          minimum: 1
          title: Stage1 Top K
          description: Stage 1 top-k for hybrid methods (optimal/best)
          default: 100
        max_tokens_per_doc:
          type: integer
          minimum: 1
          title: Max Tokens Per Doc
          description: Max tokens per document
          default: 512
        batch_size:
          anyOf:
            - type: integer
              minimum: 1
            - type: 'null'
          title: Batch Size
          description: Batch size for processing
        return_documents:
          type: boolean
          title: Return Documents
          description: Include documents in response
          default: false
      type: object
      required:
        - model
        - query
        - documents
      title: RerankRequestSchema
    RerankResponseSchema:
      properties:
        results:
          items:
            $ref: '#/components/schemas/RerankResult'
          type: array
          title: Results
          description: Ranked results
        meta:
          $ref: '#/components/schemas/RerankMeta'
          description: Operation metadata
      type: object
      required:
        - results
        - meta
      title: RerankResponseSchema
    HTTPValidationError:
      properties:
        detail:
          items:
            $ref: '#/components/schemas/ValidationError'
          type: array
          title: Detail
      type: object
      title: HTTPValidationError
    RerankResult:
      properties:
        index:
          type: integer
          title: Index
          description: Document index in input
        relevance_score:
          type: number
          title: Relevance Score
          description: Relevance score
        document:
          anyOf:
            - type: string
            - additionalProperties: true
              type: object
            - type: 'null'
          title: Document
          description: Document text (if requested)
      type: object
      required:
        - index
        - relevance_score
      title: RerankResult
    RerankMeta:
      properties:
        method:
          type: string
          title: Method
          description: Rerank method used
        deployed_model_id:
          type: string
          title: Deployed Model Id
          description: Model ID used
        model_cross:
          anyOf:
            - type: string
            - type: 'null'
          title: Model Cross
          description: Cross-encoder used (if any)
        stage1_top_k:
          anyOf:
            - type: integer
            - type: 'null'
          title: Stage1 Top K
          description: Stage 1 top-k for hybrid methods
      type: object
      required:
        - method
        - deployed_model_id
      title: RerankMeta
    ValidationError:
      properties:
        loc:
          items:
            anyOf:
              - type: string
              - type: integer
          type: array
          title: Location
        msg:
          type: string
          title: Message
        type:
          type: string
          title: Error Type
      type: object
      required:
        - loc
        - msg
        - type
      title: ValidationError
  securitySchemes:
    HTTPBearer:
      type: http
      scheme: bearer

````