InferenceBench

voice.transcription.librispeech-clean-mini

1 entry. Pareto frontier computed on throughput_tok_per_s (higher is better) vs. ttft_p50_ms (lower is better). Rows marked P are on the frontier.
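As an illustration of the frontier rule described above, here is a minimal sketch, assuming a row is on the frontier when no other row is at least as good on both metrics and strictly better on one. The names Row, dominates, and pareto_frontier are hypothetical, and the sample rows are invented for the example rather than taken from the leaderboard. How the site itself breaks ties or treats rows with missing metrics is not stated.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """Hypothetical leaderboard row holding only the two frontier metrics."""
    name: str
    throughput_tok_per_s: float  # higher is better
    ttft_p50_ms: float           # lower is better


def dominates(a: Row, b: Row) -> bool:
    """Return True if a is no worse than b on both metrics and strictly better on one."""
    no_worse = (
        a.throughput_tok_per_s >= b.throughput_tok_per_s
        and a.ttft_p50_ms <= b.ttft_p50_ms
    )
    strictly_better = (
        a.throughput_tok_per_s > b.throughput_tok_per_s
        or a.ttft_p50_ms < b.ttft_p50_ms
    )
    return no_worse and strictly_better


def pareto_frontier(rows: list[Row]) -> list[Row]:
    """Keep every row that no other row dominates (simple O(n^2) scan)."""
    return [r for r in rows if not any(dominates(other, r) for other in rows)]


# Invented sample data, for illustration only.
rows = [
    Row("a", 100.0, 50.0),  # fastest throughput
    Row("b", 80.0, 30.0),   # lowest time to first token
    Row("c", 70.0, 60.0),   # worse than both a and b on both metrics
]

for row in pareto_frontier(rows):
    print("P", row.name)
```

Under this rule a leaderboard with a single entry always places that entry on the frontier, since there is no other row to dominate it.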

Model Engine Hardware Quant TTFT P50 (ms) TTFT P99 (ms) Throughput (tok/s) $/M tokens J/token Power avg (W) Power peak (W) WER mean J / audio s Envelope
Systran/faster-whisper-large-v3 whisper-http unknown 1x NVIDIA RTX 4000 Ada Generation Laptop GPU fp16 11.51 40.50 0.0700 32.33 JSON