POST /api/v3/evaluations
Create Evaluation
curl --request POST \
  --url https://api.nugen.in/api/v3/evaluations \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model_id": "model-xyz789",
  "benchmark_id": "benchmark-abc123",
  "custom_metrics": {},
  "model_id_2": "model-def456"
}
'
{
  "evaluation_id": "eval-abc123",
  "model_id": "model-xyz789",
  "status": "pending",
  "benchmark_id": "benchmark-abc123",
  "created_at": "2024-02-24T10:00:00Z",
  "message": "Evaluation created successfully"
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
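
As an illustration of the header format above, here is a minimal Python sketch that prepares an authenticated request without sending it. The URL is taken from the curl example; the helper name `build_request` is hypothetical and not part of any official client.

```python
import json
import urllib.request

# Endpoint URL as shown in the curl example on this page.
API_URL = "https://api.nugen.in/api/v3/evaluations"


def build_request(token: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request with Bearer authentication.

    `build_request` is a hypothetical helper used only for illustration.
    """
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Header of the form "Bearer <token>", as described above.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; that call is left out here so the sketch has no network side effects.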

Body

application/json
model_id
string
required

ID of the model to evaluate

Example:

"model-xyz789"

benchmark_id
string
required

ID of an existing benchmark from the BenchmarkTask table

Example:

"benchmark-abc123"

custom_metrics
Custom Metrics · object

Custom metrics configuration

model_id_2
string | null

ID of a second model, used in comparison mode (eval-compare)

Example:

"model-def456"
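
The body fields above can be assembled as in the following sketch. It assumes that the fields not marked as required (`custom_metrics` and `model_id_2`) may simply be omitted when unused; this page does not confirm that, so treat it as an assumption. The function name `build_evaluation_payload` is hypothetical.

```python
from typing import Optional


def build_evaluation_payload(
    model_id: str,
    benchmark_id: str,
    custom_metrics: Optional[dict] = None,
    model_id_2: Optional[str] = None,
) -> dict:
    """Build the JSON body for a create-evaluation request.

    Hypothetical helper. Assumes optional fields can be left out
    of the body when they are not needed.
    """
    if not model_id or not benchmark_id:
        raise ValueError("model_id and benchmark_id are required")

    payload = {"model_id": model_id, "benchmark_id": benchmark_id}
    if custom_metrics is not None:
        payload["custom_metrics"] = custom_metrics
    if model_id_2 is not None:
        # Supplying a second model corresponds to comparison mode (eval-compare).
        payload["model_id_2"] = model_id_2
    return payload
```

For a single-model evaluation, `build_evaluation_payload("model-xyz789", "benchmark-abc123")` yields a body with only the two required fields; adding `model_id_2="model-def456"` produces a comparison request like the one in the curl example.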

Response

This endpoint starts an asynchronous evaluation of the specified model(s) against the specified benchmark. The response returns a unique identifier for the evaluation along with its initial status, so that you can track progress and retrieve results once the evaluation completes.

evaluation_id
string
required

Unique identifier for the evaluation

Example:

"eval-abc123"

model_id
string
required

ID of the model being evaluated

Example:

"model-xyz789"

status
string
required

Current status of the evaluation

Example:

"pending"

benchmark_id
string
required

ID of the benchmark used for the evaluation

Example:

"benchmark-abc123"

created_at
string
required

Timestamp of evaluation creation

Example:

"2024-02-24T10:00:00Z"

message
string
required

Status message

Example:

"Evaluation created successfully"
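
The response fields listed above could be checked on the client side roughly as follows. This is a sketch, not an official client: the helper name `parse_evaluation_response` is hypothetical, and it assumes `created_at` is an ISO 8601 timestamp with a trailing `Z`, as in the example value.

```python
import json
from datetime import datetime

# Fields marked "required" in the response description above.
REQUIRED_FIELDS = (
    "evaluation_id",
    "model_id",
    "status",
    "benchmark_id",
    "created_at",
    "message",
)


def parse_evaluation_response(body: str) -> dict:
    """Parse a create-evaluation response and verify the required fields.

    Hypothetical helper. Assumes `created_at` looks like
    "2024-02-24T10:00:00Z" (ISO 8601, UTC), as in the example.
    """
    data = json.loads(body)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")

    # Replace the trailing "Z" so fromisoformat also works on Python < 3.11.
    data["created_at"] = datetime.fromisoformat(
        data["created_at"].replace("Z", "+00:00")
    )
    return data
```

The returned `evaluation_id` is the value to keep for tracking the evaluation later; how status is retrieved is not covered on this page.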