POST /api/v3/inference/completions
Generate Streaming Text Completions
curl --request POST \
  --url https://api.nugen.in/api/v3/inference/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "max_tokens": 400,
  "model": "nugen-flash-instruct",
  "prompt": "The sky is",
  "stream": false,
  "temperature": 1
}'

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
model
string
required

The name of the model to use.

Example:

"nugen-flash-instruct"

prompt
required

The prompt to generate completions for. It can be a single string or a list of strings. It can also be an array of integers or an array of integer arrays, which allows passing an already tokenized prompt.

Example:

"The sky is"
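The four accepted prompt shapes can be sketched as request payloads. A minimal illustration, with field names taken from the curl example above; the token IDs are hypothetical placeholders, not real tokenizer output:

```python
import json

# The "prompt" field accepts any of four shapes (per the description above):
payloads = [
    {"model": "nugen-flash-instruct", "prompt": "The sky is"},                  # single string
    {"model": "nugen-flash-instruct", "prompt": ["The sky is", "Roses are"]},   # list of strings
    {"model": "nugen-flash-instruct", "prompt": [464, 6766, 318]},              # pre-tokenized prompt
    {"model": "nugen-flash-instruct", "prompt": [[464, 6766, 318], [48, 4]]},   # batch of token arrays
]

for p in payloads:
    print(json.dumps(p))  # each serializes to a valid JSON request body
```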

max_tokens
integer | null
default:16

The maximum number of tokens to generate in the completion.

Required range: x >= 0
Example:

400

temperature
number | null
default:1

What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

Required range: 0 <= x <= 2
Example:

1

stream
boolean | null
default:false

Whether to stream back partial progress as server-sent events.
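When stream is true, partial results arrive as server-sent events. A minimal sketch of consuming such a stream, assuming each event is a `data:` line carrying a JSON chunk; the chunk schema and the `[DONE]` sentinel (common in completion APIs) are assumptions, not confirmed by this page:

```python
import json

def parse_sse_lines(lines):
    """Yield decoded JSON payloads from the 'data:' lines of an SSE stream.

    A '[DONE]' sentinel (assumed here) terminates the stream.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alive lines, etc.
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)

# Illustrative stream with a hypothetical chunk shape:
sample = [
    'data: {"text": "The sky is"}',
    'data: {"text": " blue."}',
    'data: [DONE]',
]
chunks = list(parse_sse_lines(sample))
print("".join(c["text"] for c in chunks))  # → The sky is blue.
```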

Response

A stream of partial text completion responses (server-sent events) or a single complete response, depending on the stream parameter.

The response is of type any.
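The curl request at the top of this page can equally be issued from Python. A sketch using only the standard library, with the endpoint URL and field names taken from this page; the token placeholder is yours to supply:

```python
import json
import urllib.request

API_URL = "https://api.nugen.in/api/v3/inference/completions"
TOKEN = "<token>"  # replace with your auth token

# Same body as the curl example above.
body = json.dumps({
    "model": "nugen-flash-instruct",
    "prompt": "The sky is",
    "max_tokens": 400,
    "temperature": 1,
    "stream": False,
}).encode()

req = urllib.request.Request(
    API_URL,
    data=body,
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Uncomment to send the request (requires a valid token):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```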