Skip to content

Install

InferenceBench is a Python package. The bench binary and the inferencebench alias both land in your PATH.

pip install inferencebench inferencebench-llm
bench --version
bench plugins

The 4 core packages on PyPI today: inferencebench (the CLI), inferencebench-envelope (schema + signing), inferencebench-harness (drivers + telemetry + metrics), inferencebench-llm (vLLM perf plugin). This is enough to run any LLM perf benchmark against a vLLM/SGLang endpoint and to verify any signed envelope.

If you have uv installed, this puts bench on your PATH globally without the venv dance:

uv tool install inferencebench --with inferencebench-llm

Now bench --help works from any shell, any cwd. Upgrade later with uv tool upgrade inferencebench.

With all plugins (in-progress; 4 of 12 on PyPI today)

The full plugin matrix has 12 packages: 4 core + 8 plugins/integrations. As of v0.0.2 the 4 core packages are on PyPI; the other 8 (inferencebench-quality, -mt, -code, -voice, -embeddings, -vision, -hf-publisher, -leaderboard) are pending publication on a PyPI rate-limit window and will land in v0.0.3.

Until then, the cleanest way to get the full set is to install from a clone:

git clone https://github.com/yobitelcomm/bench
cd bench
uv sync --all-packages --dev --prerelease=allow
uv run bench --help

Or inject the in-tree wheels into a uv tool install:

git clone https://github.com/yobitelcomm/bench
cd bench
for pkg in envelope harness plugins/llm-inference plugins/llm-quality plugins/llm-mt \
           plugins/code-generation plugins/voice-transcription plugins/embeddings-retrieval \
           plugins/vision-understanding integrations/hf-publisher tools/leaderboard; do
  (cd "$pkg" && uv build)
done

uv tool install inferencebench \
  --with inferencebench-llm \
  --with ./dist/inferencebench_quality-*.whl \
  --with ./dist/inferencebench_mt-*.whl \
  --with ./dist/inferencebench_code-*.whl \
  --with ./dist/inferencebench_voice-*.whl \
  --with ./dist/inferencebench_embeddings-*.whl \
  --with ./dist/inferencebench_vision-*.whl \
  --with ./dist/inferencebench_hf_publisher-*.whl \
  --with ./dist/inferencebench_leaderboard-*.whl \
  --prerelease=allow

Verify the install

bench --version
bench plugins
bench list

You should see at least llm.inference listed under plugins, and the 5 bundled benchmark specs it ships.

Optional extras

Extra Adds When you need it
[publish] inferencebench-hf-publisher (after v0.0.3) If you want bench publish --to hf
[leaderboard] inferencebench-leaderboard (after v0.0.3) If you want bench leaderboard --build or bench dashboard
[fixtures] datasets~=4.0 If you want bench fixtures fetch <hf-dataset>
[all] All of the above One-shot full-fat install
pip install 'inferencebench[fixtures]'      # works today
pip install 'inferencebench[all]'           # works after v0.0.3

Supported platforms

Platform Status
Linux x86_64 with NVIDIA H100 + driver 580.x Validated (50-envelope marathon, 2026-05-18)
Linux x86_64, other NVIDIA GPUs (RTX 4090, A100) Best-effort; SLO multipliers in bench doctor --show-slo adjust thresholds per GPU class
Linux ARM (Grace, Ampere Altra) Phase 2
macOS (Apple Silicon) Phase 2 — MLX engine adapter is in tree but unvalidated against real mlx_lm.server
Windows Not planned

Python versions

Python 3.12 or newer. Older versions are not supported.

Uninstall

pip uninstall inferencebench inferencebench-envelope inferencebench-harness inferencebench-llm
# or, if installed via uv tool:
uv tool uninstall inferencebench