POST /inference/chat_vision

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
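As a sketch, the header can be built like this in Python; the API_TOKEN environment variable is an assumption for illustration, so substitute however you store your token:

```python
import os

# Assumption: the auth token lives in the API_TOKEN environment variable.
headers = {"Authorization": f"Bearer {os.environ['API_TOKEN']}"}
```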

Body

application/json
model
string
required

The name of the vision model to use.

messages
any[]
required

A list of messages comprising the conversation so far (see the example request after this parameter list).

max_tokens
integer | null
default: 200

The maximum number of tokens to generate in the completion.

prompt_truncate_len
integer | null
default: 1500

The maximum prompt length; longer chat prompts are truncated to this size.

temperature
number | null
default: 1

The sampling temperature to use. Higher values make the output more random; lower values make it more focused and deterministic.
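Putting the parameters together, here is a minimal request sketch in Python. The host name, environment variable, model name, and image URL are placeholders, and the message layout assumes the common OpenAI-style format in which vision input is passed as image_url content parts; this page only specifies messages as any[].

```python
import os
import requests

# Hypothetical host; only the /inference/chat_vision path comes from this page.
URL = "https://api.example.com/inference/chat_vision"

payload = {
    "model": "your-vision-model",  # required: name of the vision model
    "messages": [  # required: the conversation so far
        {
            "role": "user",
            # Assumed OpenAI-style multimodal content parts.
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.png"}},
            ],
        }
    ],
    "max_tokens": 200,            # optional; default 200
    "prompt_truncate_len": 1500,  # optional; default 1500
    "temperature": 1,             # optional; default 1
}

resp = requests.post(
    URL,
    headers={"Authorization": f"Bearer {os.environ['API_TOKEN']}"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()
data = resp.json()
```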

Response

200 - application/json
id
string
required

A unique identifier of the response.

created
number
required

The Unix time in seconds when the response was generated.

model
string
required

The model used for the chat completion.

choices
object[]
required

The list of chat completion choices.

usage
object | null

Usage statistics.

For streaming responses, the usage field is included in the final response chunk.

Note that returning usage for streaming requests is a common LLM API extension. If you use a popular LLM SDK, you may be able to access the field directly even if it is not present in the SDK's type signature.
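Continuing the request sketch above, the documented response fields can be read as follows. The inner structure of each choice is not specified on this page, so the message/content access below assumes the common OpenAI-style shape:

```python
def show_response(data: dict) -> None:
    """Print the documented response fields; `data` is the parsed JSON body,
    e.g. resp.json() from the request sketch above."""
    print(data["id"])       # unique identifier of the response
    print(data["created"])  # Unix time in seconds when the response was generated
    print(data["model"])    # model used for the chat completion

    # Assumed OpenAI-style choice shape (not specified on this page).
    for choice in data["choices"]:
        print(choice.get("message", {}).get("content"))

    usage = data.get("usage")  # may be null; for streams it arrives in the final chunk
    if usage is not None:
        print(usage)
```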
