Generate Vision-Language Chat Completions

curl --request POST \
  --url https://api.nugen.in/api/v3/inference/chat/vision \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "max_tokens": 2000,
  "messages": [
    {
      "content": [
        {
          "text": "Can you describe this image?",
          "type": "text"
        },
        {
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/c/c7/Asiatic_Lioness_with_around_30_days_old_cub.jpg"
          },
          "type": "image_url"
        }
      ],
      "role": "user"
    }
  ],
  "model": "nugen-flash-vision",
  "prompt_truncate_len": 1500,
  "temperature": 1
}'

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The image in document describes a flowchart in a patent for optimizing jet engine compressor.",
        "role": "assistant"
      }
    }
  ],
  "created": 16286546649.54,
  "id": "nugen-1234",
  "model": "nugen-flash-vision",
  "usage": {
    "completion_tokens": 7,
    "prompt_tokens": 5,
    "total_tokens": 12
  }
}

POST /api/v3/inference/chat/vision

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
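As a minimal sketch of the header format described above, the helper below builds the two headers used in the curl example. The function name `auth_headers` is hypothetical and not part of any Nugen SDK.

```python
def auth_headers(token: str) -> dict[str, str]:
    """Return the request headers: a Bearer token plus a JSON content type.

    `auth_headers` is an illustrative helper, not an official client function.
    """
    token = token.strip()
    if not token:
        raise ValueError("token must be a non-empty string")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

Stripping whitespace before formatting avoids a malformed header when the token is read from a file or environment variable with a trailing newline.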

Body

application/json

model

string

required

The name of the vision model to use.

messages

any[]

required

A list of messages comprising the conversation so far.

Minimum length: 1

max_tokens

integer | null

default: 2000

The maximum number of tokens to generate in the completion.

prompt_truncate_len

integer | null

default: 1500

The length to which chat prompts are truncated.

temperature

number | null

default: 1

The sampling temperature to use.

Required range: 0 <= x <= 2
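The body parameters above can be assembled and checked client-side before sending. The following is a sketch using only the Python standard library; `build_vision_payload` and `build_request` are hypothetical helper names, and the validation simply mirrors the types, defaults, and ranges listed above rather than any documented server behaviour.

```python
import json
import urllib.request

API_URL = "https://api.nugen.in/api/v3/inference/chat/vision"


def build_vision_payload(model, prompt, image_url,
                         max_tokens=2000, prompt_truncate_len=1500,
                         temperature=1.0):
    """Build a request body with one user message: a text part and an image part."""
    # max_tokens is documented as an integer, so reject strings such as "2000".
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int):
        raise TypeError("max_tokens must be an integer")
    # temperature is documented with the required range 0 <= x <= 2.
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    return {
        "model": model,
        "messages": [  # documented minimum length: 1
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": max_tokens,
        "prompt_truncate_len": prompt_truncate_len,
        "temperature": temperature,
    }


def build_request(payload, token):
    """Wrap the payload in a POST request object; nothing is sent here."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result of `build_request` to `urllib.request.urlopen` would send the request; it is left out here so the sketch runs without network access or a valid token.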

Response

Generated multi-modal chat response with image understanding capabilities

id

string

required

A unique identifier of the response.

created

number

required

The Unix time in seconds when the response was generated.

model

string

required

The model used for the chat completion.

choices

Choice · object[]

required

The list of chat completion choices.

usage

object | null

Usage statistics.

For streaming responses, the usage field is included in the very last response chunk returned.

Returning usage for streaming requests is a common LLM API extension. If you use a popular LLM SDK, you may be able to access the field directly even if it is not present in the SDK's type signature.
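The response fields above can be read as in the sketch below. Both helper names are hypothetical. The `choices` child fields (`message`, `finish_reason`) are taken from the sample response, and the streaming helper assumes each chunk has already been parsed into a dict, since the chunk format is not described here.

```python
from datetime import datetime, timezone


def summarize_response(response: dict) -> dict:
    """Pull the commonly used fields out of a parsed (non-streaming) response."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    first = choices[0]
    usage = response.get("usage")  # documented as object | null
    return {
        "id": response["id"],
        "model": response["model"],
        # `created` is a Unix time in seconds; convert it to an aware UTC datetime.
        "created": datetime.fromtimestamp(response["created"], tz=timezone.utc),
        "text": first["message"]["content"],
        "finish_reason": first.get("finish_reason"),
        "total_tokens": usage["total_tokens"] if usage else None,
    }


def last_usage(chunks):
    """Return the usage object from the last chunk that carries one, else None.

    Assumes `chunks` is an iterable of already-parsed chunk dicts.
    """
    usage = None
    for chunk in chunks:
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage
```

Because `usage` may be null, the summary falls back to `None` rather than raising, and `last_usage` scans the whole stream instead of assuming a fixed chunk position.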
