Pricing

Per-evaluation pricing. No subscriptions. Pay only when you benchmark.

Quick Eval

$500 per agent

Fast assessment against core DREAM metrics with a standard task suite.

  • 10 evaluation scenarios
  • 5 DREAM metric scores
  • Summary report (PDF)
  • 24-hour turnaround
  • 1 agent per evaluation

Standard Eval (Most Popular)

$1,200 per agent

Comprehensive evaluation with reasoning traces and domain-specific suites.

  • 25 evaluation scenarios
  • 5 DREAM metric scores + sub-metrics
  • Reasoning trace visualization
  • Domain-specific task suites
  • Side-by-side comparison (up to 3 agents)
  • Interactive dashboard + PDF + JSON
  • Improvement recommendations

Enterprise

$2,000+ per agent

Full-depth evaluation with custom scenarios, dedicated support, and API access.

  • 50+ custom scenarios
  • Custom metric weighting
  • API access for CI/CD integration
  • Unlimited agent comparisons
  • Dedicated evaluation engineer
  • Priority 12-hour turnaround
  • Custom reporting and white-labeling
  • SOC 2 compliant infrastructure

Frequently Asked Questions

What counts as one evaluation?

One evaluation covers a single agent version run through a complete task suite. Re-runs of the same agent version with different parameters count as separate evaluations.

Can I evaluate my agent via API?

Yes. Our REST API supports programmatic agent submission, evaluation triggering, and result retrieval. API access is included in Enterprise and available as an add-on for Standard.

What frameworks do you support?

We support any agent accessible via an HTTP endpoint, with native integrations for LangChain, AutoGen, CrewAI, and custom Python agents. Bring-your-own-trace upload is also supported.

How are DREAM scores calibrated?

DREAM metrics are calibrated against expert human judgments whose inter-annotator agreement is κ = 0.87. We continuously validate the scoring rubrics against a held-out dataset of 500+ expert-annotated research tasks.
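
For readers unfamiliar with the κ statistic, here is a minimal sketch of how an agreement score of this kind can be computed. The page does not say which kappa variant is used, so this assumes Cohen's kappa for two annotators; the function name and the pass/fail labels are made up for illustration and are not DREAM data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: two annotators and categorical labels. The source does
    not state which kappa variant underlies the reported 0.87.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both annotators pick the same
    # label if each labels at random according to their own frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Illustrative labels only (hypothetical, not taken from the DREAM dataset).
annotator_1 = ["pass", "pass", "fail", "fail"]
annotator_2 = ["pass", "pass", "fail", "pass"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Kappa corrects raw agreement for the agreement expected by chance, so a value of 1 means perfect agreement and 0 means no better than chance.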

Do you offer volume discounts?

Yes. Teams evaluating 5+ agents per month receive 20% off. Annual commitments include additional savings. Contact sales for custom pricing.
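
As a rough illustration of the volume discount, the sketch below estimates a monthly bill from the list prices on this page. It assumes the 20% applies to every agent evaluated in a month once the 5-agent threshold is reached, and it leaves out Enterprise and annual commitments because those are custom-priced. The function and constant names are hypothetical; confirm the actual terms with sales.

```python
from decimal import Decimal

# List prices per agent, taken from the tiers on this page.
PRICE_PER_AGENT = {
    "quick": Decimal("500"),
    "standard": Decimal("1200"),
}

VOLUME_THRESHOLD = 5                # "5+ agents per month"
VOLUME_DISCOUNT = Decimal("0.20")   # "20% off"


def monthly_cost(tier: str, agents: int) -> Decimal:
    """Estimate the monthly cost in USD for `agents` evaluations of one tier.

    Assumption: the discount applies to the whole month's subtotal,
    not only to agents beyond the threshold.
    """
    if agents < 0:
        raise ValueError("agents must be non-negative")
    subtotal = PRICE_PER_AGENT[tier] * agents
    if agents >= VOLUME_THRESHOLD:
        subtotal *= 1 - VOLUME_DISCOUNT
    return subtotal


four_standard = monthly_cost("standard", 4)   # below the threshold, no discount
five_standard = monthly_cost("standard", 5)   # at the threshold, 20% off
```

Under these assumptions, four Standard evaluations come to $4,800 at list price, and five come to the same $4,800 after the discount.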