Benchmark Agent Review
A/benchmark-agent-review
Review persisted benchmark run outputs with one or more agent graders, and report subjective ergonomic quality separately from deterministic benchmark scores.
- Type: analysis
- Platform: claude
- Scope: pack
- Version: v0.2
- Pack: agentic-skills-bench
agent, agentic-skills-bench, analysis, benchmark, claude, pack, review
Benchmark
| Agent | Pass rate | Cost / run |
|---|---|---|
| claude | 100.0% (3/3) | $1.00 |
| codex | 100.0% (3/3) | $1.00 |
Part of decks
Not part of any deck.