POST /api/v3/inference/chat/vision
Generate Vision-Language Chat Completions
curl --request POST \
  --url https://api.nugen.in/api/v3/inference/chat/vision \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "max_tokens": "2000",
  "messages": [
    {
      "content": [
        {
          "text": "Can you describe this image?",
          "type": "text"
        },
        {
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/c/c7/Asiatic_Lioness_with_around_30_days_old_cub.jpg"
          },
          "type": "image_url"
        }
      ],
      "role": "user"
    }
  ],
  "model": "nugen-flash-vision",
  "prompt_truncate_len": 1500,
  "temperature": 1
}'
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The image in document describes a flowchart in a patent for optimizing jet engine compressor.",
        "role": "assistant"
      }
    }
  ],
  "created": 16286546649.54,
  "id": "nugen-1234",
  "model": "nugen-flash-vision",
  "usage": {
    "completion_tokens": 7,
    "prompt_tokens": 5,
    "total_tokens": 12
  }
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
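For example, with Python's requests library (a minimal sketch; NUGEN_API_KEY is an illustrative environment-variable name, not one mandated by the API):

import os

import requests

# Read the auth token from an environment variable (illustrative name).
API_KEY = os.environ["NUGEN_API_KEY"]

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}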

Body

application/json
model
string
required

The name of the vision model to use.

messages
any[]
required

A list of messages comprising the conversation so far.

Minimum length: 1
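Each message's content can mix text and image_url parts, as in the cURL example above. A minimal sketch of building such a list in Python:

# One user message combining a text part and an image part,
# mirroring the request body in the cURL example above.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Can you describe this image?"},
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://upload.wikimedia.org/wikipedia/commons/c/c7/Asiatic_Lioness_with_around_30_days_old_cub.jpg"
                },
            },
        ],
    }
]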
max_tokens
integer | null
default:2000

The maximum number of tokens to generate in the completion.

prompt_truncate_len
integer | null
default:1500

The size to which to truncate chat prompts.

temperature
number | null
default:1

What sampling temperature to use.

Required range: 0 <= x <= 2
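Putting the body parameters together, a sketch of a complete request (reusing headers and messages from the snippets above; the parameter values shown are the documented defaults):

payload = {
    "model": "nugen-flash-vision",
    "messages": messages,
    "max_tokens": 2000,           # integer, per the schema above
    "prompt_truncate_len": 1500,  # documented default
    "temperature": 1,             # must satisfy 0 <= x <= 2
}

resp = requests.post(
    "https://api.nugen.in/api/v3/inference/chat/vision",
    headers=headers,
    json=payload,
    timeout=60,
)
resp.raise_for_status()
data = resp.json()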

Response

The generated multi-modal chat response, with image-understanding capabilities.

id
string
required

A unique identifier of the response.

created
number
required

The Unix time in seconds when the response was generated.

model
string
required

The model used for the chat completion.

choices
Choice · object[]
required

The list of chat completion choices.
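To read the assistant's reply from a non-streaming response, take the first element of choices (a sketch against the example response above; data is the parsed JSON from the request sketch):

first = data["choices"][0]
print(first["finish_reason"])       # e.g. "stop"
print(first["message"]["content"])  # the assistant's description of the image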

usage
object | null

Usage statistics.

For streaming responses, the usage field is included in the very last response chunk returned.

Note that returning usage for streaming requests is a common LLM API extension. If you use a popular LLM SDK, you may be able to access the field directly even if it is not declared in the SDK's type signatures.
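A sketch of that pattern, assuming the endpoint accepts an OpenAI-style "stream": true body flag and emits server-sent-event chunks; neither detail is documented on this page, so treat both as assumptions:

import json

# ASSUMPTION: "stream": true requests SSE chunks, and only the final
# chunk carries a non-null usage object (per the note above).
usage = None
with requests.post(
    "https://api.nugen.in/api/v3/inference/chat/vision",
    headers=headers,
    json={**payload, "stream": True},
    stream=True,
    timeout=60,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue
        body = line[len(b"data: "):]
        if body == b"[DONE]":
            break
        chunk = json.loads(body)
        if chunk.get("usage"):
            usage = chunk["usage"]  # present only on the last chunk

print(usage)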