POST /api/v3/evaluations
Create Evaluation
Example request:

curl --request POST \
  --url https://api.nugen.in/api/v3/evaluations/ \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model_id": "<string>",
  "benchmark_id": "<string>",
  "judge_provider": "anthropic",
  "custom_metrics": {},
  "model_id_2": "<string>"
}
'
Example response:

{
  "evaluation_id": "<string>",
  "model_id": "<string>",
  "status": "<string>",
  "benchmark_id": "<string>",
  "judge_provider": "<string>",
  "created_at": "<string>",
  "message": "<string>"
}

Authorizations

Authorization · string · header · required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

Request schema for creating a new model evaluation

model_id · string · required

ID of the model to evaluate

benchmark_id · string · required

ID of an existing benchmark from the BenchmarkTask table

judge_provider · string · default: anthropic

Judge model provider; one of anthropic, openai, or nugen

custom_metrics · object

Custom metrics configuration

model_id_2 · string | null

ID of the second model for comparison mode (eval-compare)
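Putting the body fields above together, here is a minimal Python sketch of the request using only the standard library. The model and benchmark IDs, the token, and the helper function names are placeholders for illustration, not values from the API:

```python
import json
import urllib.request

API_URL = "https://api.nugen.in/api/v3/evaluations/"

def build_evaluation_request(model_id, benchmark_id, judge_provider="anthropic",
                             custom_metrics=None, model_id_2=None):
    """Assemble the JSON body for POST /api/v3/evaluations."""
    return {
        "model_id": model_id,                 # required
        "benchmark_id": benchmark_id,         # required
        "judge_provider": judge_provider,     # anthropic, openai, or nugen
        "custom_metrics": custom_metrics or {},
        "model_id_2": model_id_2,             # optional second model (eval-compare)
    }

def send_evaluation(body, token):
    """POST the body with bearer auth; return the parsed JSON response."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Build a body without sending it (no network call here):
body = build_evaluation_request("my-model", "my-benchmark")
print(json.dumps(body))
```

Only `model_id` and `benchmark_id` are required; the other fields fall back to the documented defaults.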

Response

Successful Response

Response schema for evaluation creation

evaluation_id · string · required

Unique identifier for the evaluation

model_id · string · required

ID of the model being evaluated

status · string · required

Current status of the evaluation

benchmark_id · string · required

Benchmark ID used

judge_provider · string · required

Judge model provider used

created_at · string · required

ISO timestamp of evaluation creation

message · string · required

Status message
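Since all seven response fields are required, a caller can sanity-check a parsed response before using it. A sketch of such a check follows; the sample field values (the evaluation ID, status string, and message) are invented for illustration and are not documented values:

```python
REQUIRED_RESPONSE_FIELDS = (
    "evaluation_id", "model_id", "status", "benchmark_id",
    "judge_provider", "created_at", "message",
)

def validate_evaluation_response(resp: dict) -> dict:
    """Raise ValueError if any required field is missing; return resp unchanged."""
    missing = [f for f in REQUIRED_RESPONSE_FIELDS if f not in resp]
    if missing:
        raise ValueError(f"response missing required fields: {missing}")
    return resp

# Invented sample response for illustration only:
sample = {
    "evaluation_id": "eval_123",
    "model_id": "my-model",
    "status": "pending",
    "benchmark_id": "my-benchmark",
    "judge_provider": "anthropic",
    "created_at": "2024-01-01T00:00:00Z",
    "message": "Evaluation created",
}
validate_evaluation_response(sample)
```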