Trace
The trace command provides headless trace inspection and analysis, with no server or dashboard needed. All data is read from local JSONL result files.
Subcommands
trace list
Enumerate evaluation result files from .agentv/results/.

    agentv trace list [--limit N] [--format json|table]

Shows filename, test count, pass rate, average score, file size, and timestamp for each result file.
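As a rough illustration of how the per-file columns could be derived, here is a minimal Python sketch. It is not agentv's implementation. It assumes every non-empty line of a result file is a JSON object with a numeric `score` field (the field name appears in the jq examples on this page), and it uses a hypothetical `pass_threshold` because this page does not say how a pass is decided. The function name `summarize_result_file` is likewise invented for the example.

```python
import json
from pathlib import Path


def summarize_result_file(path, pass_threshold=0.5):
    """Summarize one JSONL result file: test count, pass rate, average score, size.

    Assumptions (not confirmed by agentv): each non-empty line is a JSON
    object with a numeric `score`, and a test counts as passed when its
    score is at least the hypothetical `pass_threshold`.
    """
    path = Path(path)
    scores = []
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            score = record.get("score")
            if isinstance(score, (int, float)):
                scores.append(float(score))

    count = len(scores)  # only records that carry a numeric score are counted
    return {
        "file": path.name,
        "tests": count,
        "pass_rate": sum(s >= pass_threshold for s in scores) / count if count else None,
        "avg_score": sum(scores) / count if count else None,
        "size_bytes": path.stat().st_size,
    }
```

Calling `summarize_result_file` on each `*.jsonl` file under `.agentv/results/` would give one row per file, similar in spirit to the table that `trace list` prints.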
trace show
Display evaluation results with trace details.

    agentv trace show <result-file> [--test-id <id>] [--tree] [--format json|table]

| Option | Description |
|---|---|
| --test-id | Filter to a specific test ID |
| --tree | Show hierarchical trace tree (requires results with output messages) |
| --format, -f | Output format: table (default), json |
Tree View
The --tree flag renders tool call traces as a hierarchical tree:

    research-question, 15.1s, 10,167 tok, $0.105
    ├─ tools, 2.4s
    │  ├─ WebSearch, 2.1s
    │  └─ WebSearch, 1.8s
    ├─ tavily_search, 3.5s
    └─ write_report, 450ms

    Scores: response_quality 75% | routing_accuracy 100%

Falls back to a flat summary when output messages are not present in the result file.
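The tree layout above can be produced by a short recursive walk. The following Python sketch shows one way to do it and is not taken from agentv. The node shape (`label` plus optional `children`) and the function name `render_tree` are hypothetical, chosen only for illustration.

```python
def render_tree(node, prefix="", is_last=True, is_root=True):
    """Return the lines for `node` and its descendants with box-drawing connectors.

    `node` is a hypothetical dict: {"label": str, "children": [node, ...]}.
    """
    if is_root:
        lines = [node["label"]]
        child_prefix = ""
    else:
        connector = "└─ " if is_last else "├─ "
        lines = [prefix + connector + node["label"]]
        # Keep a vertical bar under this node only if more siblings follow it.
        child_prefix = prefix + ("   " if is_last else "│  ")

    children = node.get("children", [])
    for index, child in enumerate(children):
        last_child = index == len(children) - 1
        lines.extend(render_tree(child, child_prefix, last_child, is_root=False))
    return lines


# Sample data copied from the tree shown above.
trace = {
    "label": "research-question, 15.1s, 10,167 tok, $0.105",
    "children": [
        {
            "label": "tools, 2.4s",
            "children": [
                {"label": "WebSearch, 2.1s"},
                {"label": "WebSearch, 1.8s"},
            ],
        },
        {"label": "tavily_search, 3.5s"},
        {"label": "write_report, 450ms"},
    ],
}

print("\n".join(render_tree(trace)))
```

Returning a list of lines rather than printing directly keeps the function easy to test and lets a caller fall back to a flat summary when there are no children.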
trace stats
Compute summary statistics (percentiles) across evaluation results.

    agentv trace stats <result-file> [--group-by target|dataset|test-id] [--format json|table]

| Option | Description |
|---|---|
| --group-by, -g | Group statistics by: target, dataset, or test-id |
| --format, -f | Output format: table (default), json |
Output shows mean, P50, P90, P95, and P99 for score, latency, cost, tokens, tool calls, and LLM calls.

    Metric       Mean       P50        P90        P95        P99
    ──────────── ────────── ────────── ────────── ────────── ──────────
    score        0.83       0.90       1.00       1.00       1.00
    latency_s    11.7       9.5        22.8       25.4       27.5
    cost_usd     $0.077     $0.065     $0.150     $0.165     $0.177
    tokens_total 7,463      7,000      13,367     14,433     15,287

Metrics with no data are omitted automatically.
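To make the columns concrete, here is a minimal Python sketch of a mean-plus-percentiles summary. It is an illustration under stated assumptions, not agentv's code: the page does not say which percentile method is used, so this sketch picks the nearest-rank method, and it treats each result as a hypothetical flat dict keyed by the metric names shown in the sample output. The names `percentile` and `summarize_metrics` are invented for the example.

```python
import math
from statistics import mean


def percentile(sorted_values, p):
    """Nearest-rank percentile of a non-empty ascending list; p is in (0, 100]."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize_metrics(rows, metrics=("score", "latency_s", "cost_usd", "tokens_total")):
    """Compute mean, P50, P90, P95 and P99 per metric over `rows`.

    `rows` is a hypothetical list of flat dicts. A metric with no values at
    all is left out of the result, matching the note above about omitted metrics.
    """
    summary = {}
    for name in metrics:
        values = sorted(row[name] for row in rows if row.get(name) is not None)
        if not values:
            continue  # no data for this metric, so omit it
        summary[name] = {
            "mean": mean(values),
            "p50": percentile(values, 50),
            "p90": percentile(values, 90),
            "p95": percentile(values, 95),
            "p99": percentile(values, 99),
        }
    return summary
```

Grouping (as with --group-by) could be layered on top by bucketing rows by a key first and calling `summarize_metrics` once per bucket. An interpolating percentile method would give slightly different values on small samples.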
Composability
All commands support --format json for piping to jq:

    # Find tests costing more than $0.10
    agentv trace show results.jsonl --format json \
      | jq '[.[] | select(.trace.cost_usd > 0.10) | {test_id, score, cost: .trace.cost_usd}]'

    # Compare providers
    agentv trace stats results.jsonl --group-by target --format json \
      | jq '.groups[] | {label, score_mean: .metrics.score.mean}'

Example
See examples/features/trace-analysis/ for a complete showcase with sample data.