Captioning Leaderboard

Updated 9 minutes ago
Votes power leaderboards.
Top Model Scores

ELO ratings for the highest-performing models
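
How votes turn into ELO ratings is not spelled out on this page. As a rough sketch only, the standard Elo update for a single head-to-head vote looks like the following. The K-factor of 24, the starting rating of 1200, and the function names `expected_score` and `update` are assumptions made for illustration, not the leaderboard's documented method.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 24.0):
    """Return new (rating_a, rating_b) after one vote.

    score_a is 1.0 if A won the vote, 0.0 if A lost, 0.5 for a tie.
    k is an assumed K-factor; the leaderboard's real value is unknown.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two models at an assumed starting rating of 1200; A wins one vote.
new_a, new_b = update(1200, 1200, 1.0)
print(round(new_a), round(new_b))  # 1212 1188
```

Under these assumptions a single win between equally rated models moves each rating by 12 points, and a tie between them leaves both unchanged.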

Performance vs Accuracy

ELO score vs average latency • Better models (higher ELO, lower latency) are top-left
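
One way to read "top-left" is as a Pareto front: a model is worth a look if no other model is at least as good on both ELO and latency and strictly better on one of them. The sketch below illustrates that idea. The `Entry` type, the `pareto_front` helper, and the sample values are hypothetical and are not taken from the leaderboard.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    model: str
    elo: int
    latency_s: float


def pareto_front(entries: List[Entry]) -> List[Entry]:
    """Keep entries that no other entry dominates (higher ELO, lower latency)."""
    front = []
    for e in entries:
        dominated = any(
            o.elo >= e.elo
            and o.latency_s <= e.latency_s
            and (o.elo > e.elo or o.latency_s < e.latency_s)
            for o in entries
        )
        if not dominated:
            front.append(e)
    return front


# Hypothetical sample data, for illustration only.
sample = [
    Entry("Model A", 1220, 6.0),
    Entry("Model B", 1210, 2.0),
    Entry("Model C", 1200, 4.0),  # dominated by Model B
    Entry("Model D", 1190, 1.5),
]
print([e.model for e in pareto_front(sample)])  # ['Model A', 'Model B', 'Model D']
```

Model C drops out because Model B has both a higher ELO and a lower latency. The other three each win on at least one axis against every rival.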

Rank  Provider   Model                     ELO   Change  Avg. latency (s)
1     Google     Gemma 3 4B                1224  +12     5.55
2     Google     Gemini 2.0 Flash Exp      1223  +12     3.60
3     Google     Gemma 3 12B               1212  +12     7.88
4     Google     Gemma 3 27B               1212  +12     6.74
4     Google     Gemini 1.5 Pro            1212  +12     5.63
4     Mistral    Pixtral 12B               1212  +12     3.71
5     Google     Gemini 1.5 Flash          1212  +12     4.78
6     Meta       Llama 3.2 Vision 11b      1212  +12     10.38
7     OpenAI     GPT-5 Mini                1211  +11     8.88
8     Qwen       Qwen VL Max               1211  +11     17.62
9     OpenAI     GPT-5 Nano                1211  +11     19.00
10    Meta       Llama 4 Maverick          1210  -1      2.01
11    Qwen       Qwen2.5-VL-7B-Instruct    1201  +13     6.47
12    Anthropic  Claude 4 Opus             1200  -12     11.59
13    OpenAI     GPT-4o                    1200  0       3.78
13    Mistral    Mistral Small 3.1 24B     1200  +0      5.23
13    Anthropic  Claude 3.7 Sonnet         1200  0       8.15
14    Meta       Llama 3.2 Vision 90b      1200  0       4.63
15    xAI        Grok 2 Vision 1212        1200  -13     5.33
16    OpenAI     GPT-4.1 nano              1200  -12     5.87
17    Google     Gemini 2.5 Pro            1200  +0      17.93
18    Mistral    Mistral Medium 3.1        1189  -12     3.94
19    OpenAI     GPT-4o mini               1188  -12     4.01
19    OpenAI     GPT-5 Chat                1188  -12     4.08
20    Anthropic  Claude 3.5 Sonnet         1177  -12     7.45
21    Anthropic  Claude 3 Haiku            1165  0       3.87
22    Anthropic  Claude 4 Sonnet           1165  -11     8.59
23    OpenAI     GPT-5                     1165  -11     18.74