Benchmark Agent Review

/benchmark-agent-review

Review persisted benchmark run outputs with one or more agent graders, and report subjective ergonomic quality separately from deterministic benchmark scores.
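
The separation described above can be illustrated with a minimal sketch. It is not the skill's actual implementation: the `RunReview` record, its field names, the `summarize` helper, and the numeric rating scale are all hypothetical, chosen only to show deterministic results and grader ratings being reported under separate keys rather than blended into one score.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class RunReview:
    """One persisted benchmark run plus grader ratings (hypothetical schema)."""
    run_id: str
    passed: bool  # deterministic benchmark outcome
    # grader name -> subjective ergonomic rating (scale is an assumption)
    grader_ratings: dict[str, float] = field(default_factory=dict)


def summarize(reviews: list[RunReview]) -> dict:
    """Report deterministic and subjective results under separate keys."""
    passed = sum(r.passed for r in reviews)
    ratings = [x for r in reviews for x in r.grader_ratings.values()]
    return {
        "deterministic": {
            "passed": passed,
            "total": len(reviews),
            "pass_rate": passed / len(reviews) if reviews else None,
        },
        "subjective": {
            "mean_rating": mean(ratings) if ratings else None,
            "num_ratings": len(ratings),
        },
    }


# Illustrative data only; not taken from any real benchmark run.
reviews = [
    RunReview("run-1", True, {"grader-a": 4.0, "grader-b": 3.0}),
    RunReview("run-2", True, {"grader-a": 5.0}),
    RunReview("run-3", False),
]
summary = summarize(reviews)
empty_summary = summarize([])
```

Keeping the two groups in separate keys means a low subjective rating can never change the reported pass rate, and a run with no grader ratings still counts toward the deterministic totals.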

v0.2

Platform: claude
Scope: pack
Pack: agentic-skills-bench
Version: v0.2

Type: analysis
Tags: agent, agentic-skills-bench, analysis, benchmark, claude, pack, review

Benchmark

Agent    Pass rate       Cost / run
claude   100.0% (3/3)    $1.00
codex    100.0% (3/3)    $1.00

Part of decks

Not part of any deck.