AI Vision Model Rankings


Explore top-performing models across computer vision tasks. Compare accuracy, speed, and user votes to find the best AI models.

Rankings are driven by user votes.

Overall Model Rankings

Average performance across all supported vision tasks

Rank  Model                          Score  Tasks  Avg Latency
1     Seg Preview                    1369   1      5.40 s
2     Gemini 2.5 Flash (Google)      1249   4      5.34 s
3     Gemini 2.5 Pro (Google)        1249   4      16.52 s
4     YOLO World                     1236   1      2.89 s
5     Gemini 2.0 Flash Exp (Google)  1214   5      3.64 s
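
The overall score is described as an average across supported vision tasks. The page does not state the exact formula, so the following is only a minimal sketch that assumes a plain unweighted mean of per-task scores. The function name `overall_score` and the task key `object_detection` are hypothetical. The two models used are the ones listed with a single task, so their overall score should match their object detection score.

```python
from statistics import mean


def overall_score(task_scores: dict[str, float]) -> float:
    """Return the unweighted mean of a model's per-task scores.

    Assumption: the leaderboard averages per-task scores with equal
    weight; the page itself does not document the formula.
    """
    if not task_scores:
        raise ValueError("model has no scored tasks")
    return mean(list(task_scores.values()))


# Scores as listed on this page; both models are ranked on one task only.
seg_preview = {"object_detection": 1369}
yolo_world = {"object_detection": 1236}

print(overall_score(seg_preview))
print(overall_score(yolo_world))
```

Under this assumption, a model that is ranked on several tasks can place lower overall than in its strongest task, because weaker tasks pull the mean down.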

Object Detection Model Rankings

Models that detect and localize objects in images.

Rank  Model                      Score  Tasks  Avg Latency
1     Seg Preview                1369   1      5.40 s
2     Gemini 2.5 Flash (Google)  1320   4      9.16 s
3     Gemini 2.5 Pro (Google)    1312   4      16.77 s
4     Florence-2 (Azure)         1252   3      4.22 s
5     YOLO World                 1236   1      2.89 s

Classification Model Rankings

Models that classify images into categories.

Rank  Model                           Score  Tasks  Avg Latency
1     Claude 3 Opus (Anthropic)       1258   4      3.00 s
2     Gemini 2.0 Flash Exp (Google)   1220   5      4.18 s
3     Gemini 2.5 Flash (Google)       1213   4      5.05 s
4     Claude 3.7 Sonnet (Anthropic)   1210   5      4.47 s
5     Gemini 2.5 Flash Lite (Google)  1203   4      3.02 s

OCR Model Rankings

Models that extract text from images.

Rank  Model                           Score  Tasks  Avg Latency
1     Gemini 2.5 Flash Lite (Google)  1239   4      2.89 s
2     GPT-4o mini (OpenAI)            1229   3      7.14 s
3     Mistral Medium 3.1 (Mistral)    1224   3      15.02 s
4     Claude 4 Opus (Anthropic)       1223   5      6.31 s
5     Claude 3 Haiku (Anthropic)      1222   5      2.39 s

Captioning Model Rankings

Models that generate descriptive captions for images.

Rank  Model                          Score  Tasks  Avg Latency
1     Gemini 2.5 Pro (Google)        1236   4      19.57 s
2     Gemma 3 4B (Google)            1224   3      5.55 s
3     Claude 3.7 Sonnet (Anthropic)  1224   5      7.97 s
4     Llama 3.2 Vision 11b (Meta)    1222   4      11.04 s
5     Gemma 3 12B (Google)           1212   3      7.88 s

Open Prompt Model Rankings

Models that interpret free-form prompts on images.

Rank  Model                        Score  Tasks  Avg Latency
1     Gemini 2.5 Flash (Google)    1244   4      3.92 s
2     Gemini 2.5 Pro (Google)      1235   4      15.21 s
3     Claude 3 Opus (Anthropic)    1234   4      5.13 s
4     Llama 3.2 Vision 90b (Meta)  1215   4      5.56 s
5     Llama 4 Maverick (Meta)      1212   3      3.56 s