Get the results of a completed evaluation
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
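As a minimal sketch of the header format described above, the request could be built like this in Python. Only the "Bearer <token>" form comes from this page. The URL is a placeholder, the helper name build_results_request is hypothetical, and the use of the standard HTTP Authorization header with a GET request is an assumption.

```python
import urllib.request


def build_results_request(url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request that carries a Bearer token.

    Hypothetical helper: the page only specifies the "Bearer <token>" form.
    Sending it in the Authorization header and using GET are assumptions.
    """
    headers = {"Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


# Placeholder URL and token, not real values from this API.
request = build_results_request("https://api.example.com/results", "my-token")
```

Passing the resulting object to urllib.request.urlopen would send the request. The sketch stops before that step so that it runs without network access.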
Successful Response
Response schema for evaluation results
Unique identifier for the evaluation
ID of the model that was evaluated
ID of the benchmark used for the evaluation
Evaluation status
Number of raw answers generated
ISO 8601 timestamp of when the evaluation completed
Evaluation metrics and scores (single model)
Evaluation method: 'eval' or 'eval-compare'
ID of the second model (comparison mode)
Base model results (comparison mode)
Eval model results (comparison mode)
Comparison results between models
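The schema above suggests that the populated result fields depend on the evaluation method. The following sketch shows one way a client might branch on it. The field names are not given on this page, so every key used here ("method", "results", "base_results", "eval_results", "comparison") is a hypothetical stand-in, as is the helper name select_results. Only the two method values 'eval' and 'eval-compare' come from the schema.

```python
from typing import Any


def select_results(payload: dict[str, Any]) -> dict[str, Any]:
    """Pick the result sections that apply to the evaluation method.

    All dictionary keys are assumed names; replace them with the real
    field names from the response schema.
    """
    method = payload.get("method")
    if method == "eval":
        # Single-model evaluation: one block of metrics and scores.
        return {"mode": "single", "metrics": payload.get("results")}
    if method == "eval-compare":
        # Comparison mode: per-model results plus the comparison itself.
        return {
            "mode": "comparison",
            "base": payload.get("base_results"),
            "eval": payload.get("eval_results"),
            "comparison": payload.get("comparison"),
        }
    raise ValueError(f"unexpected evaluation method: {method!r}")


# Illustrative payloads with placeholder values, not real API output.
single = select_results({"method": "eval", "results": {"score": 1}})
compared = select_results({"method": "eval-compare", "comparison": {"winner": "x"}})
```

Raising on an unknown method, rather than returning an empty result, makes a schema change visible to the caller instead of silently yielding no data.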