InferenceBench
Benchmark categories
51 signed envelopes across 11 categories.
Suite                                       Entries
code.generation.humaneval-mini              8
code.generation.mbpp-mini                   2
llm.inference.chatbot-short                 9
llm.mt.flores-200-mini-en-fr                5
llm.quality.arithmetic-mini                 6
llm.quality.factual-mini                    9
llm.quality.persona-consistency-mini        8
llm.quality.reasoning-mini                  1
vision.understanding.chart-qa-mini          1
vision.understanding.ocr-mini               1
voice.transcription.librispeech-clean-mini  1