POST /inference/chat_vision

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
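As a minimal sketch of an authenticated call (the host, model name, and API_TOKEN environment variable are assumptions; only the path and header format come from this page):

```python
import os

import requests

# Placeholder host -- this page documents only the /inference/chat_vision
# path, not the API's base URL.
BASE_URL = "https://api.example.com"

response = requests.post(
    f"{BASE_URL}/inference/chat_vision",
    headers={
        # Bearer authentication header of the form "Bearer <token>".
        "Authorization": f"Bearer {os.environ['API_TOKEN']}",
        "Content-Type": "application/json",
    },
    json={
        "model": "example-vision-model",  # placeholder model name
        "messages": [{"role": "user", "content": "Describe this image."}],
    },
)
response.raise_for_status()
print(response.json())
```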

Body

application/json
messages
any[]
required

A list of messages comprising the conversation so far.

model
string
required

The name of the vision model to use.

max_tokens
integer | null
default: 2000

The maximum number of tokens to generate in the completion.

prompt_truncate_len
integer | null
default: 1500

The maximum prompt length; chat prompts longer than this are truncated to fit.

temperature
number | null
default: 1

What sampling temperature to use.

Required range: 0 < x < 2
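Putting the body parameters together, a request body might look like the following sketch. The image-content message schema is an assumption (an OpenAI-style layout common to vision chat APIs); the parameter names and defaults are the ones documented above.

```python
# Sketch of a request body. The content-part schema for images is an
# assumption; parameter names and defaults match the list above.
body = {
    "model": "example-vision-model",  # required: name of the vision model
    "messages": [  # required: the conversation so far
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.png"}},
            ],
        }
    ],
    "max_tokens": 2000,           # default: 2000
    "prompt_truncate_len": 1500,  # default: 1500
    "temperature": 1,             # default: 1, range 0 < x < 2
}
```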

Response

200 - application/json
choices
object[]
required

The list of chat completion choices.

created
number
required

The Unix time in seconds when the response was generated.

id
string
required

A unique identifier of the response.

model
string
required

The model used for the chat completion.

usage
object | null

Usage statistics.

For streaming responses, the usage field is included only in the very last response chunk returned.

Note that returning usage for streaming requests is a common LLM API extension. If you use a popular LLM SDK, you may be able to access this field directly even if it is not present in the SDK's type signature.
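Given that, here is a sketch of pulling usage out of a streamed response. Two assumptions are not documented on this page: that the endpoint accepts a "stream" flag, and that it streams OpenAI-style server-sent events ("data: {...}" lines ending with "data: [DONE]"). Only the placement of usage in the final chunk comes from the note above.

```python
import json
import os

import requests

BASE_URL = "https://api.example.com"  # placeholder host, as above

# Usage arrives only in the final chunk, so keep the last non-null
# value seen while iterating over the stream.
usage = None
with requests.post(
    f"{BASE_URL}/inference/chat_vision",
    headers={"Authorization": f"Bearer {os.environ['API_TOKEN']}"},
    json={
        "model": "example-vision-model",
        "messages": [{"role": "user", "content": "Describe this image."}],
        "stream": True,  # assumed flag; not documented on this page
    },
    stream=True,
) as resp:
    for line in resp.iter_lines():
        if not line.startswith(b"data: "):
            continue  # skip blank keep-alive lines
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":
            break
        chunk = json.loads(payload)
        if chunk.get("usage") is not None:
            usage = chunk["usage"]

print(usage)  # token-count statistics, if the server returned them
```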
