Skip to content

The signed envelope

Every benchmark result is captured in a single JSON document called an envelope. The envelope is the unit of trust in InferenceBench. If a result is not in an envelope, it does not exist.

Why an envelope

A bare metric like "1842 tokens per second" is meaningless without context. The envelope binds the metric to:

  • The exact model and revision
  • The exact engine and configuration hash
  • The hardware fingerprint (GPU, CPU, memory, BIOS, drivers)
  • The software provenance (image digest, pip freeze hash, git commit)
  • The dataset hash
  • The seed
  • A Sigstore signature

Anyone with the envelope can re-derive the configuration, attempt to reproduce, and verify the signature against the Sigstore transparency log.

What is in an envelope

{
  "envelope_version": "v1",
  "suite_id": "llm.inference",
  "suite_version": "1.0.0",
  "run_id": "01J7Q5C6...",
  "timestamp": "2026-11-15T12:34:56Z",
  "model": {
    "id": "meta-llama/Llama-4-Maverick",
    "revision": "abc123...",
    "provider": "vllm-local",
    "endpoint_hash": "8f9a..."
  },
  "engine": {
    "name": "vllm",
    "version": "0.7.2",
    "config_hash": "8b1a...",
    "image_digest": "sha256:..."
  },
  "quantization": {"format": "fp8", "method": "..."},
  "hardware_fingerprint": { "...": "see below" },
  "software_provenance": {
    "image_digest": "sha256:...",
    "pip_freeze_hash": "...",
    "git_commit": "..."
  },
  "dataset": {"id": "sharegpt-v3", "hash": "..."},
  "seed": 42,
  "driver_options": {"...": "..."},
  "metrics": {
    "ttft_p50_ms": 142.0,
    "ttft_p99_ms": 280.3,
    "tpot_p50_ms": 18.5,
    "throughput_tok_per_s": 1842.1,
    "goodput_at_slo": 142.3,
    "joules_per_token": 0.32
  },
  "distributions": {
    "ttft_ms": "traces.parquet#ttft_ms",
    "tpot_ms": "traces.parquet#tpot_ms"
  },
  "slo_template": "llm.standard",
  "warnings": [],
  "signature": {
    "method": "sigstore-cosign",
    "certificate": "-----BEGIN CERTIFICATE-----...",
    "rekor_log_index": 12345,
    "bundle": "..."
  }
}

Fields, top to bottom

Field Meaning
envelope_version Schema version. v1 is the current schema.
suite_id Suite identifier, e.g. llm.inference.
suite_version SemVer of the plugin that produced the result.
run_id UUIDv7. Sortable by creation time.
timestamp RFC 3339 UTC.
model Provider-prefixed id, revision, provider, endpoint hash.
engine Engine name, version, config hash, optional image digest.
quantization Quant format (fp8, nvfp4, awq-int4, ...) plus method notes.
hardware_fingerprint DMI UUID, GPUs, CPU, memory, BIOS, NUMA, driver, CUDA, NCCL, and the SHA-256 of all of it. See Hardware fingerprinting.
software_provenance OCI image digest (if containerized), pip freeze hash, git commit, nvidia-smi -q hash.
dataset Dataset id and canonical SHA-256.
seed Explicit integer used to seed every stochastic step.
driver_options Driver settings (Poisson rate, concurrency, duration).
metrics Per-suite metrics. Keys are defined by the plugin.
distributions Pointers to distribution data (parquet column or inline histogram).
slo_template SLO template id (llm.standard, voice.realtime, ...).
warnings Non-fatal warnings emitted during the run.
signature Sigstore signature bundle.

The content hash

The signature signs the SHA-256 of the canonical JSON body, excluding the signature block itself:

body = envelope.model_dump(exclude={"signature"}, mode="json")
canonical = json.dumps(body, sort_keys=True, separators=(",", ":"), default=str)
content_hash = hashlib.sha256(canonical.encode("utf-8")).hexdigest()

Any byte change in the body invalidates the signature. bench verify recomputes the content hash before checking the Sigstore signature, so a tampered envelope is rejected even if the signature would otherwise verify.

Signing modes

Mode When Identity
sigstore-cosign Production OSS use GitHub Actions OIDC token (keyless) or local browser OIDC flow
dev-key Local development and tests Ed25519 key generated with cosign generate-key-pair

The dev-key mode is for testing only. Published envelopes must be signed with sigstore-cosign.

Schema versioning

The schema is versioned via the envelope_version field. Any breaking change bumps the version and ships a migration. Every previous-version fixture must still verify after a bump.

See also