Captioning Model Rankings
Updated 4 minutes ago
Votes power rankings.
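The page does not say how votes are turned into ratings. As a rough illustration only, a textbook ELO update for a single head-to-head vote between two captions could look like the sketch below. The K-factor of 32, the 400-point scale, and the function names are assumptions for the example, not values taken from this leaderboard.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Modelled probability that A is preferred over B (standard logistic ELO form)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one vote.

    k is an assumed K-factor; the leaderboard's real value is not stated.
    """
    expected_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - expected_a)
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two models with equal ratings: the winner gains k/2 points, the loser drops k/2.
new_a, new_b = update_elo(1200.0, 1200.0, a_won=True)
```

Under this scheme a win over a higher-rated model moves the ratings more than a win over a lower-rated one, which is why a handful of votes can separate models that start from the same score.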
Top Model Scores
ELO ratings for the highest performing models
Performance vs Accuracy
ELO score vs average latency • Better models are top-left
Rank | Type | ELO | | Avg. latency | Provider | |
1 | multimodal | 1278 | 5 | 21.10 s | ||
2 | multimodal | 1257 | 5 | 7.49 s | Anthropic | |
3 | multimodal | 1225 | 5 | 10.82 s | ||
4 | multimodal | 1224 | 3 | 9.92 s | ||
5 | multimodal | 1223 | 3 | 9.14 s | ||
6 | multimodal | 1222 | 5 | 4.23 s | ||
7 | multimodal | 1212 | 5 | 12.18 s | Anthropic | |
8 | multimodal | 1212 | 3 | 6.74 s | ||
9 | multimodal | 1212 | 5 | 17.31 s | Anthropic | |
10 | multimodal | 1211 | 3 | 17.62 s | Qwen | |
11 | multimodal | 1211 | 3 | 1.94 s | Meta | |
12 | multimodal | 1209 | 3 | 6.06 s | OpenAI | |
13 | multimodal | 1203 | 3 | 14.47 s | xAI | |
14 | multimodal | 1200 | 4 | 6.21 s | Meta | |
15 | multimodal | 1200 | 3 | 5.23 s | Mistral | |
16 | multimodal | 1200 | 3 | 15.05 s | Mistral | |
17 | multimodal | 1200 | 5 | 12.13 s | Anthropic | |
18 | multimodal | 1200 | 3 | 5.82 s | OpenAI | |
19 | multimodal | 1200 | 3 | 5.87 s | OpenAI | |
20 | multimodal | 1196 | 4 | 13.78 s | Meta | |
21 | multimodal | 1191 | 3 | 18.65 s | OpenAI | |
22 | multimodal | 1190 | 5 | 9.72 s | Anthropic | |
23 | multimodal | 1190 | 4 | 2.15 s | ||
24 | multimodal | 1190 | 3 | 4.09 s | Mistral | |
25 | multimodal | 1187 | 3 | 9.45 s | OpenAI | |
26 | multimodal | 1176 | 3 | 6.42 s | OpenAI | |
27 | multimodal | 1175 | 3 | 8.73 s | Anthropic | |
28 | multimodal | 1175 | 3 | 19.40 s | OpenAI | |
29 | multimodal | 1167 | 3 | 4.09 s | Meta | |
30 | vlm | 1155 | 3 | 7.03 s | Microsoft | |
31 | vlm | 1154 | 3 | 6.34 s | Qwen | |
32 | multimodal | 1146 | 5 | 3.16 s | Anthropic | |
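One way to make the "top-left" reading of the ELO vs latency chart concrete is to keep only the rows that no other row beats on both axes (a Pareto front). The sketch below is an assumption about how a reader might analyze the table, not something the page itself computes; the `Row` layout and the `pareto_front` helper are hypothetical, and only five rows from the table above are used as sample input.

```python
from typing import List, Tuple

Row = Tuple[int, int, float]  # (rank, elo, avg_latency_seconds)


def pareto_front(rows: List[Row]) -> List[Row]:
    """Keep rows that are not dominated on (higher ELO, lower latency)."""
    front = []
    for r in rows:
        dominated = any(
            o[1] >= r[1] and o[2] <= r[2] and (o[1] > r[1] or o[2] < r[2])
            for o in rows
        )
        if not dominated:
            front.append(r)
    return front


# A subset of the table: ranks 1, 2, 3, 6 and 11.
sample = [
    (1, 1278, 21.10),
    (2, 1257, 7.49),
    (3, 1225, 10.82),
    (6, 1222, 4.23),
    (11, 1211, 1.94),
]

for rank, elo, latency in pareto_front(sample):
    print(f"rank {rank}: ELO {elo}, {latency:.2f} s")
```

Within this sample, rank 3 drops out because rank 2 has both a higher score and a lower latency; the other four rows each represent a different trade-off between score and speed.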