InferenceBench

Benchmark categories

51 signed envelopes across 11 categories.

Suite                                       Entries
code.generation.humaneval-mini              8
code.generation.mbpp-mini                   2
llm.inference.chatbot-short                 9
llm.mt.flores-200-mini-en-fr                5
llm.quality.arithmetic-mini                 6
llm.quality.factual-mini                    9
llm.quality.persona-consistency-mini        8
llm.quality.reasoning-mini                  1
vision.understanding.chart-qa-mini          1
vision.understanding.ocr-mini               1
voice.transcription.librispeech-clean-mini  1
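
The headline figures can be cross-checked against the table with a short tally. The following is a minimal sketch, not part of InferenceBench itself: the dictionary is copied by hand from the table above, and the names SUITE_ENTRIES and totals_by_domain are hypothetical. Treating the first dot-separated component of a suite name (code, llm, vision, voice) as a top-level domain is an assumption made here for illustration only.

```python
from collections import Counter

# Entry counts copied by hand from the table above (hypothetical variable name).
SUITE_ENTRIES = {
    "code.generation.humaneval-mini": 8,
    "code.generation.mbpp-mini": 2,
    "llm.inference.chatbot-short": 9,
    "llm.mt.flores-200-mini-en-fr": 5,
    "llm.quality.arithmetic-mini": 6,
    "llm.quality.factual-mini": 9,
    "llm.quality.persona-consistency-mini": 8,
    "llm.quality.reasoning-mini": 1,
    "vision.understanding.chart-qa-mini": 1,
    "vision.understanding.ocr-mini": 1,
    "voice.transcription.librispeech-clean-mini": 1,
}


def totals_by_domain(entries):
    """Sum entry counts by the first dot-separated component of each suite name.

    Assumption: that first component is a top-level domain. The source does
    not define the naming scheme, so this grouping is illustrative only.
    """
    totals = Counter()
    for suite, count in entries.items():
        domain = suite.split(".", 1)[0]
        totals[domain] += count
    return dict(totals)


total_entries = sum(SUITE_ENTRIES.values())  # total entries across all suites
suite_count = len(SUITE_ENTRIES)             # number of rows in the table
by_domain = totals_by_domain(SUITE_ENTRIES)

print(f"{total_entries} entries across {suite_count} suites")
for domain, count in sorted(by_domain.items()):
    print(f"{domain}: {count}")
```

Under these assumptions, the row counts sum to the stated 51 entries over 11 suites, with most entries falling under the llm prefix.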