Generate conversational responses with real-time streaming support for immediate response chunks. Conversations are automatically saved if enabled in user settings.
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
The name of the model to use, e.g. "nugen-flash-instruct".
A list of messages comprising the conversation so far.
The maximum number of tokens to generate in the completion.
The token length to which chat prompts are truncated.
What sampling temperature to use, between 0 and 2 (0 <= x <= 2).
Whether to stream back partial progress as server-sent events.
A list of tools (functions) the model may call.
Controls tool selection: 'auto', 'none', or a specific tool.
The nucleus sampling (top-p) parameter.
The top-k sampling parameter.
The number of completions to generate.
Reasoning configuration for the model.
Streaming chat completion chunks, or a single complete response, depending on the stream parameter.
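The request shape described above can be sketched as follows. This is a minimal illustration, not the official client: the endpoint URL is a placeholder, and the SSE chunk schema (`choices[0].delta.content`) is an assumed OpenAI-style layout that may differ from the actual response format.

```python
import json

# Placeholder endpoint and token; substitute your real values.
API_URL = "https://api.example.com/v1/chat/completions"  # assumed URL
TOKEN = "<token>"  # your auth token

headers = {
    "Authorization": f"Bearer {TOKEN}",  # Bearer authentication header
    "Content-Type": "application/json",
}

# Request body built from the documented parameters.
payload = {
    "model": "nugen-flash-instruct",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 256,   # maximum tokens to generate in the completion
    "temperature": 0.7,  # sampling temperature, 0 <= x <= 2
    "top_p": 0.9,        # nucleus sampling
    "stream": True,      # stream partial progress as server-sent events
}

def parse_sse_line(line: str):
    """Parse one server-sent-event line into a JSON chunk, or None.

    Assumes the common `data: {...}` / `data: [DONE]` SSE framing;
    the exact chunk schema may differ from the shape shown below.
    """
    if not line.startswith("data: "):
        return None
    data = line[len("data: "):].strip()
    if data == "[DONE]":
        return None
    return json.loads(data)

# Example usage with the `requests` library (uncomment to run):
# import requests
# with requests.post(API_URL, headers=headers, json=payload, stream=True) as r:
#     for raw in r.iter_lines(decode_unicode=True):
#         chunk = parse_sse_line(raw or "")
#         if chunk:
#             print(chunk["choices"][0]["delta"].get("content", ""), end="")
```

With `stream` set to `False`, the same payload returns one complete response body instead of a sequence of `data:` chunks.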