Reports & Baselines

Every AXIS run produces a report with full scoring breakdowns and interaction transcripts. Baselines let you snapshot scores and detect regressions over time.

Understanding Reports

Every run automatically saves a report to .axis/reports/. Each report is a directory containing a manifest and per-scenario result files.

.axis/reports/{reportId}/
  report.json                          # Manifest with summary + metadata
  report.html                          # Visual report (after scoring)
  scenarios/{key}/{agent}.json         # Full result with transcript + scores
  scenarios/{key}/{agent}.raw.ndjson   # Raw agent stdout
  scenarios/{key}/{agent}.sparse-index.txt  # Compressed transcript for scoring
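
The layout above lends itself to simple shell scripting. As a minimal sketch, the helper below lists the per-scenario result files for one report. The function name list_scenario_results is hypothetical (it is not part of AXIS), and the code relies only on the directory structure shown above.

```shell
# Hypothetical helper (not part of AXIS): print every
# scenarios/{key}/{agent}.json path inside one report directory.
# Assumes the layout shown above. The raw (.raw.ndjson) and sparse-index
# (.txt) files are skipped because their names do not end in ".json".
list_scenario_results() {
  report_dir="$1"   # e.g. .axis/reports/{reportId}
  find "$report_dir/scenarios" -type f -name '*.json' | sort
}
```

Called as list_scenario_results .axis/reports/{reportId}, it prints one path per scenario/agent combination, which can then be fed to whatever JSON tooling you prefer.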

The manifest

report.json contains the run metadata and a summary of every scenario/agent result: the composite AXIS Result, per-dimension scores, token usage, duration, and any error messages. This is the file you read when scripting against AXIS output.

Scenario files

Each {agent}.json file under scenarios/ contains the full result for one scenario/agent combination: the complete interaction transcript, judge evaluations, per-interaction signal scores, and the rubric assessment.

Viewing Reports

# List all reports
npx @netlify/axis reports

# View the latest report summary
npx @netlify/axis reports latest

# View a specific scenario detail
npx @netlify/axis reports latest hello-world

# Filter by agent
npx @netlify/axis reports latest --agent claude-code

HTML reports

Open the visual report in your browser for the richest view:

npx @netlify/axis reports latest --html

The HTML report includes:

JSON output

For scripting and CI integration, use --json to get machine-readable output:

npx @netlify/axis reports latest --json

Baselines

Baselines snapshot your scores at a point in time. You compare future runs against a baseline to detect regressions: scores that dropped by more than the noise tolerance (1 point).

Setting a baseline

# Save from the latest report
npx @netlify/axis baseline set

# Save with a name (for multiple baselines)
npx @netlify/axis baseline set v1.0

# Save from a specific report
npx @netlify/axis baseline set --from 20260415-143022

Comparing against a baseline

# Compare during a run (automatic)
npx @netlify/axis run --compare-baseline

# Compare explicitly after a run
npx @netlify/axis baseline compare

# Compare against a named baseline
npx @netlify/axis baseline compare v1.0

The comparison shows deltas for each score. Score changes within the noise tolerance (0 to 1 point) are reported as unchanged. Regressions are highlighted and the command exits with code 1 if any are detected.
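
The tolerance rule can be illustrated with a small sketch. This is not AXIS's own implementation: the function name classify_delta is hypothetical, and treating a rise of more than 1 point as an "improvement" is an assumption, since the text above only defines regressions and unchanged scores.

```shell
# Illustrative only: classify a score change against a 1-point noise tolerance.
NOISE_TOLERANCE=1

classify_delta() {
  # $1 = baseline score, $2 = current score
  awk -v base="$1" -v cur="$2" -v tol="$NOISE_TOLERANCE" 'BEGIN {
    delta = cur - base
    if (delta < -tol)      print "regression"   # dropped by more than the tolerance
    else if (delta > tol)  print "improvement"  # assumed mirror case for gains
    else                   print "unchanged"    # within the noise tolerance
  }'
}
```

Under this sketch, a score that moves from 80 to 79 is classified as unchanged, while a move from 80 to 78 is classified as a regression.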

When to set baselines

Managing baselines

# List all baselines
npx @netlify/axis baseline list

# View baseline contents
npx @netlify/axis baseline show

# Delete a baseline
npx @netlify/axis baseline delete v1.0

CI Integration

AXIS is designed to run in CI environments. The key patterns:

GitHub Actions example

# GitHub Actions example
- name: Run AXIS tests
  env:
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
  run: npx @netlify/axis run --json --compare-baseline
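
If you also want to keep the JSON output as a build artifact, one option is a small wrapper that saves stdout to a file while preserving the exit status. The sketch below is a generic shell pattern. The name run_and_capture is hypothetical, and it assumes the wrapped command signals regressions through a non-zero exit code, as described for the baseline comparison.

```shell
# Hypothetical wrapper (not part of AXIS): run a command, save its stdout
# to a file, and return the command's own exit status so the CI step
# still fails when the command fails.
run_and_capture() {
  out_file="$1"
  shift
  status=0
  "$@" > "$out_file" || status=$?
  if [ "$status" -ne 0 ]; then
    echo "command exited with status $status; output saved to $out_file" >&2
  fi
  return "$status"
}
```

In a workflow step this could be invoked as run_and_capture axis-result.json npx @netlify/axis run --json --compare-baseline, where axis-result.json is an arbitrary file name chosen for illustration.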

Report Storage

Baselines are stored in .axis/baselines/ and are designed to be checked into version control so that your team shares the same regression thresholds.

Reports and cached skills should not be committed:

# .gitignore
.axis/reports/
.axis/skills-cache/