Captioning Leaderboard
Updated 9 minutes ago. Votes power leaderboards.
Top Model Scores
Elo ratings for the highest-performing models
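For readers unfamiliar with how votes turn into ratings, here is a minimal sketch of the standard Elo update for a single head-to-head vote. The leaderboard does not state its exact formula, so the function names, the 400-point scale, and the K-factor of 24 are assumptions; K = 24 is chosen only because it reproduces the ±12 changes shown for models near 1200.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(winner: float, loser: float, k: float = 24.0) -> tuple[float, float]:
    """Return (new_winner, new_loser) after one vote.

    k=24 is an assumed K-factor, not a value published by the leaderboard.
    """
    delta = k * (1.0 - expected_score(winner, loser))
    return winner + delta, loser - delta


# Two models that both start at 1200: the voted-for caption gains 12 points.
print(update_elo(1200, 1200))  # (1212.0, 1188.0)
```

Under this model, a win against a higher-rated opponent moves the ratings more than a win against a lower-rated one, because the expected score was lower.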
Performance vs Accuracy
Elo score vs average latency • Better models are top-left
Rank | Model | Elo | Change | Avg. latency | Provider |
1 | Gemma 3 4B | 1224 | +12 | 5.55 s | Google |
2 | Gemini 2.0 Flash Exp | 1223 | +12 | 3.60 s | Google |
3 | Gemma 3 12B | 1212 | +12 | 7.88 s | Google |
4 | Gemma 3 27B | 1212 | +12 | 6.74 s | Google |
4 | Gemini 1.5 Pro | 1212 | +12 | 5.63 s | Google |
4 | Pixtral 12B | 1212 | +12 | 3.71 s | Mistral |
5 | Gemini 1.5 Flash | 1212 | +12 | 4.78 s | Google |
6 | Llama 3.2 Vision 11b | 1212 | +12 | 10.38 s | Meta |
7 | GPT-5 Mini | 1211 | +11 | 8.88 s | OpenAI |
8 | Qwen VL Max | 1211 | +11 | 17.62 s | Qwen |
9 | GPT-5 Nano | 1211 | +11 | 19.00 s | OpenAI |
10 | Llama 4 Maverick | 1210 | -1 | 2.01 s | Meta |
11 | Qwen2.5-VL-7B-Instruct | 1201 | +13 | 6.47 s | Qwen |
12 | Claude 4 Opus | 1200 | -12 | 11.59 s | Anthropic |
13 | GPT-4o | 1200 | 0 | 3.78 s | OpenAI |
13 | Mistral Small 3.1 24B | 1200 | +0 | 5.23 s | Mistral |
13 | Claude 3.7 Sonnet | 1200 | 0 | 8.15 s | Anthropic |
14 | Llama 3.2 Vision 90b | 1200 | 0 | 4.63 s | Meta |
15 | Grok 2 Vision 1212 | 1200 | -13 | 5.33 s | xAI |
16 | GPT-4.1 nano | 1200 | -12 | 5.87 s | OpenAI |
17 | Gemini 2.5 Pro | 1200 | +0 | 17.93 s | Google |
18 | Mistral Medium 3.1 | 1189 | -12 | 3.94 s | Mistral |
19 | GPT-4o mini | 1188 | -12 | 4.01 s | OpenAI |
19 | GPT-5 Chat | 1188 | -12 | 4.08 s | OpenAI |
20 | Claude 3.5 Sonnet | 1177 | -12 | 7.45 s | Anthropic |
21 | Claude 3 Haiku | 1165 | 0 | 3.87 s | Anthropic |
22 | Claude 4 Sonnet | 1165 | -11 | 8.59 s | Anthropic |
23 | GPT-5 | 1165 | -11 | 18.74 s | OpenAI |
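
The "better models are top-left" reading of the Elo vs latency chart can be made concrete as a Pareto check: a model is on the frontier if no other model is at least as good on both rating and latency and strictly better on one. The sketch below is illustrative only; it uses a six-row subset of the table above, and the `Entry`, `dominates`, and `pareto_front` names are hypothetical helpers, not part of the leaderboard.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    model: str
    elo: int
    latency_s: float


# Subset of the table above (Elo rating, average latency in seconds).
ROWS = [
    Entry("Gemma 3 4B", 1224, 5.55),
    Entry("Gemini 2.0 Flash Exp", 1223, 3.60),
    Entry("Pixtral 12B", 1212, 3.71),
    Entry("Llama 4 Maverick", 1210, 2.01),
    Entry("GPT-4o", 1200, 3.78),
    Entry("GPT-5", 1165, 18.74),
]


def dominates(a: Entry, b: Entry) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a.elo >= b.elo and a.latency_s <= b.latency_s
    strictly_better = a.elo > b.elo or a.latency_s < b.latency_s
    return no_worse and strictly_better


def pareto_front(rows: list[Entry]) -> list[Entry]:
    """Keep the entries that no other entry dominates, in input order."""
    return [r for r in rows if not any(dominates(o, r) for o in rows)]


print([r.model for r in pareto_front(ROWS)])
# ['Gemma 3 4B', 'Gemini 2.0 Flash Exp', 'Llama 4 Maverick']
```

In this subset, Pixtral 12B and GPT-4o drop out because Gemini 2.0 Flash Exp is both higher rated and faster than each of them.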