Performance baselines

A performance target snapshots load-test metrics and compares them with Dungbeetle's numeric-tolerance engine — the same structured-snapshot-first approach applied to terminal and DOM output, now for performance. Three tool parsers ship today: k6 (the default), Apache Benchmark (ab), and autocannon; all normalize into the same metrics shape, so tolerances and diffs work identically.

Requirement

k6 targets need k6 installed and on PATH. Like the optional Playwright browser, k6 is not bundled. The exception is a target that snapshots a pre-exported summary through the summary option, which never invokes k6.

Configure a target

json
{
  "kind": "performance",
  "name": "api-load",
  "script": "perf/script.js",
  "metrics": ["http_req_duration", "http_reqs", "checks"]
}

Options:

  • script — a k6 script. Dungbeetle runs k6 run --summary-export … and snapshots the result.
  • summary — alternatively, point at an existing k6 --summary-export JSON file to snapshot without running k6 (useful in CI where k6 ran separately).
  • metrics — restrict the snapshot to selected metric names (default: all).

Snapshot model

A performance snapshot normalizes a k6 summary into a stable, diffable shape:

json
{
  "kind": "performance",
  "tool": "k6",
  "metrics": {
    "http_req_duration": { "avg": 12.346, "min": 5.1, "med": 11, "max": 45.2, "p90": 20.1, "p95": 28.4 },
    "http_reqs": { "count": 1500, "rate": 249.8 },
    "checks": { "passes": 1500, "fails": 0, "value": 1 }
  }
}

  • k6's percentile keys (p(95)) are renamed to path-friendly names (p95).
  • Values are rounded to 3 decimals for readable baselines.
  • Non-numeric stats are dropped.
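
The three rules above can be sketched in a few lines of Python. This is an illustration, not Dungbeetle's implementation: the function names and the sample values are hypothetical, the input follows the shape k6 writes with --summary-export, and only whole-number percentile keys are handled.

```python
import json
import re

# Abridged sample in the shape k6 writes with --summary-export (values are illustrative).
K6_SUMMARY = json.loads("""
{
  "metrics": {
    "http_req_duration": {
      "avg": 12.3456, "min": 5.1, "med": 11, "max": 45.2,
      "p(90)": 20.1, "p(95)": 28.4,
      "thresholds": {"p(95)<500": false}
    },
    "http_reqs": {"count": 1500, "rate": 249.8},
    "checks": {"passes": 1500, "fails": 0, "value": 1}
  }
}
""")

# Matches whole-number percentile keys such as "p(95)"; fractional ones are out of scope here.
PERCENTILE_KEY = re.compile(r"^p\((\d+)\)$")


def normalize_stat_name(name):
    """Rename 'p(95)' to 'p95'; leave every other stat name unchanged."""
    match = PERCENTILE_KEY.match(name)
    return "p" + match.group(1) if match else name


def normalize_k6_summary(summary, metrics=None):
    """Build a snapshot-shaped dict; `metrics` optionally restricts the metric names kept."""
    snapshot = {}
    for metric, stats in summary["metrics"].items():
        if metrics is not None and metric not in metrics:
            continue
        kept = {}
        for name, value in stats.items():
            # Drop non-numeric stats. bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                continue
            kept[normalize_stat_name(name)] = round(value, 3)
        snapshot[metric] = kept
    return {"kind": "performance", "tool": "k6", "metrics": snapshot}


if __name__ == "__main__":
    print(json.dumps(normalize_k6_summary(K6_SUMMARY), indent=2))
```

In this sketch the thresholds object is dropped because it is not a number, and passing metrics=["checks"] keeps only that metric, mirroring the metrics option.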

Tolerances

Performance numbers vary run to run, so comparison relies on comparison.numericTolerance — set a relative tolerance rather than expecting exact equality:

json
{ "comparison": { "numericTolerance": { "absolute": 0, "relative": 0.2 } } }

A metric that moves within tolerance passes; one that regresses beyond it fails with a percentage-delta diff:

diff
~ http_req_duration.p95: 28.4 → 60 (+111.3%)
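
To make the arithmetic behind that diff line concrete, here is a small Python sketch. How Dungbeetle combines the absolute and relative values is not spelled out above, so the rule used here (a change passes when it fits the larger of the two allowances) is an assumption, and the function names are hypothetical.

```python
def within_tolerance(baseline, current, absolute=0.0, relative=0.0):
    """Assumed rule: pass when the change fits the larger of the two allowances."""
    allowed = max(absolute, relative * abs(baseline))
    return abs(current - baseline) <= allowed


def percent_delta(baseline, current):
    """Signed change relative to the baseline, in percent."""
    if baseline == 0:
        raise ValueError("percentage delta is undefined for a zero baseline")
    return (current - baseline) / baseline * 100


def describe(path, baseline, current, absolute=0.0, relative=0.0):
    """Return a diff line for an out-of-tolerance metric, or None when it passes."""
    if within_tolerance(baseline, current, absolute, relative):
        return None
    delta = percent_delta(baseline, current)
    return f"~ {path}: {baseline} → {current} ({delta:+.1f}%)"


if __name__ == "__main__":
    # 28.4 -> 30 moves 1.6 ms, inside the 20% allowance of 5.68 ms, so nothing is reported.
    print(describe("http_req_duration.p95", 28.4, 30, relative=0.2))
    # 28.4 -> 60 moves 31.6 ms, well outside the allowance, so a diff line is produced.
    print(describe("http_req_duration.p95", 28.4, 60, relative=0.2))
```

With relative set to 0.2, a p95 baseline of 28.4 ms tolerates values up to about 34.08 ms under this rule; the jump to 60 ms is 31.6 / 28.4, or roughly 111.3% above baseline.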

Apache Benchmark (ab)

Set tool: "ab" and give the target the benchmark command to run (or a saved output file via summary):

json
{
  "kind": "performance",
  "name": "homepage-bench",
  "tool": "ab",
  "command": "ab -n 100 -c 5 http://127.0.0.1:8000/",
  "metrics": ["requests", "total_ms", "percentiles_ms", "document"]
}

The plain-text summary normalizes into the same shape: requests (complete/failed/per_second), duration_ms, the connection-times table (connect_ms / processing_ms / waiting_ms / total_ms), percentiles_ms (p50 through p100), and document.length_bytes. The document length is kept on purpose, because a changed response size means the page itself changed:

diff
~ document.length_bytes: 58 → 414 (+613.8%)

ab ships with macOS and comes in the Apache httpd-tools package on most Linux distributions (apache2-utils on Debian and Ubuntu), so this is often the zero-install way to put a latency baseline on an endpoint.
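
As a rough picture of what parsing ab's plain-text report involves, the Python sketch below pulls three of the groups named above out of an abridged sample. It is not Dungbeetle's parser: the function name and sample numbers are hypothetical, and the mapping from ab's report lines to snapshot keys is assumed from the key names listed in this section.

```python
import re

# Abridged ab report; the numbers are illustrative.
AB_OUTPUT = """\
Document Path:          /
Document Length:        58 bytes

Concurrency Level:      5
Time taken for tests:   0.081 seconds
Complete requests:      100
Failed requests:        0
Requests per second:    1234.57 [#/sec] (mean)

Percentage of the requests served within a certain time (ms)
  50%      3
  66%      4
  75%      4
  80%      4
  90%      5
  95%      6
  98%      8
  99%     12
 100%     12 (longest request)
"""


def parse_ab(text):
    """Extract requests, document length, and percentiles from ab's text report."""

    def number(pattern):
        match = re.search(pattern, text, re.MULTILINE)
        if match is None:
            raise ValueError(f"ab output is missing: {pattern}")
        value = float(match.group(1))
        return int(value) if value.is_integer() else value

    # Percentile rows look like "  95%      6"; the key becomes "p95".
    percentiles = {
        f"p{pct}": int(ms)
        for pct, ms in re.findall(r"^[ \t]*(\d+)%[ \t]+(\d+)", text, re.MULTILINE)
    }
    return {
        "requests": {
            "complete": number(r"^Complete requests:\s+(\d+)"),
            "failed": number(r"^Failed requests:\s+(\d+)"),
            "per_second": number(r"^Requests per second:\s+([\d.]+)"),
        },
        "document": {"length_bytes": number(r"^Document Length:\s+(\d+) bytes")},
        "percentiles_ms": percentiles,
    }


if __name__ == "__main__":
    print(parse_ab(AB_OUTPUT))
```

The connection-times table and duration_ms are left out to keep the sketch short; they would follow the same pattern of one regular expression per report line.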

autocannon

Set tool: "autocannon" with a --json command (or a saved output via summary):

json
{
  "kind": "performance",
  "name": "home-load",
  "tool": "autocannon",
  "command": "npx autocannon -d 5 -c 10 --json http://127.0.0.1:8000/",
  "metrics": ["latency", "counts"]
}

The JSON summary normalizes into these groups:

  • latency / requests / throughput: stat groups with mean, stddev, min/max, and full percentiles.
  • counts: errors, timeouts, and status classes. A regression here is signal even when timing holds.
  • run: duration, connections, and pipelining, so a changed benchmark shape shows up as a named diff rather than as silently different numbers.

Timestamps never land in the snapshot.
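
The grouping described here could look roughly like the Python sketch below. The input is an abridged, illustrative autocannon --json result; the function name, the exact keys kept under counts and run, and the status-class key names in the snapshot are assumptions, not Dungbeetle's documented output.

```python
# Abridged autocannon --json result; values are illustrative.
RESULT = {
    "url": "http://127.0.0.1:8000/",
    "connections": 10,
    "pipelining": 1,
    "duration": 5.02,
    "start": "2024-01-01T00:00:00.000Z",
    "finish": "2024-01-01T00:00:05.020Z",
    "errors": 0,
    "timeouts": 0,
    "1xx": 0, "2xx": 5000, "3xx": 0, "4xx": 0, "5xx": 0,
    "latency": {"mean": 1.2, "stddev": 0.5, "min": 1, "max": 12, "p50": 1, "p99": 4},
    "requests": {"mean": 1000, "stddev": 12.5, "min": 980, "max": 1015, "p50": 1001, "p99": 1015},
    "throughput": {"mean": 203000, "stddev": 2500, "min": 198940, "max": 206045, "p50": 203200, "p99": 206045},
}

STAT_GROUPS = ("latency", "requests", "throughput")
COUNT_KEYS = ("errors", "timeouts", "1xx", "2xx", "3xx", "4xx", "5xx")
RUN_KEYS = ("duration", "connections", "pipelining")


def normalize_autocannon(result):
    """Regroup an autocannon result; anything not copied (start, finish, url) is dropped."""
    metrics = {group: dict(result[group]) for group in STAT_GROUPS}
    metrics["counts"] = {key: result[key] for key in COUNT_KEYS}
    metrics["run"] = {key: result[key] for key in RUN_KEYS}
    return {"kind": "performance", "tool": "autocannon", "metrics": metrics}


if __name__ == "__main__":
    snapshot = normalize_autocannon(RESULT)
    print(sorted(snapshot["metrics"]))
```

Because the sketch copies named keys rather than filtering unwanted ones, the start and finish timestamps never reach the snapshot, and a run with a different connection count would differ under run.connections.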

Try it

The examples/perf example uses a committed summary.json, so it needs no k6 install:

sh
dungbeetle update --config examples/perf/dungbeetle.config.json   # write the baseline
dungbeetle test   --config examples/perf/dungbeetle.config.json   # compare

Source-available: CLI under FSL-1.1-ALv2, cloud server under BUSL-1.1. See Licensing.