Grounding DINO vs YOLOv4

Compare Grounding DINO and YOLOv4 side-by-side.

These models don't share enough common tasks for a side-by-side demo. See the comparison table below for their capabilities.

Grounding DINO vs YOLOv4: Overview

Grounding DINO

Grounding DINO is an open-vocabulary object detection model developed by IDEA Research, released in March 2023 under the Apache 2.0 license. It extends the DINO transformer-based detector with grounded pre-training, enabling it to detect arbitrary objects described by free-form text queries rather than a fixed set of predefined categories. The model integrates a text encoder with a visual backbone through a feature fusion module that aligns language and visual representations at multiple scales.
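To make the contrast with a fixed-category detector concrete, the toy sketch below scores candidate region features against text-phrase features and keeps the matches that clear a threshold. It is an illustration only, not Grounding DINO's implementation: the feature vectors are hand-written stand-ins for learned embeddings, the function name match_queries_to_phrases is hypothetical, and the 0.35 threshold is an assumed value.

```python
import math


def sigmoid(x):
    """Squash a raw similarity score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def match_queries_to_phrases(query_feats, phrase_feats, box_threshold=0.35):
    """Assign each region query its best-scoring text phrase.

    Hypothetical helper: the labels come from the text prompt at
    inference time, not from a class list fixed during training.
    """
    kept = []
    for index, query in enumerate(query_feats):
        scores = {
            phrase: sigmoid(dot(query, feat))
            for phrase, feat in phrase_feats.items()
        }
        best = max(scores, key=scores.get)
        if scores[best] >= box_threshold:
            kept.append((index, best, scores[best]))
    return kept


# Toy text features: one vector per phrase in the prompt (made-up values).
phrase_feats = {
    "cat": [2.0, 0.0],
    "remote control": [0.0, 2.0],
}

# Toy region features: the third one resembles neither phrase.
query_feats = [
    [1.5, -1.0],
    [-1.0, 1.2],
    [-1.0, -1.0],
]

detections = match_queries_to_phrases(query_feats, phrase_feats)
for index, phrase, score in detections:
    print(f"query {index}: {phrase} ({score:.2f})")
```

Changing the keys of phrase_feats changes what can be detected without retraining anything, which is the practical meaning of open-vocabulary detection in this sketch.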

Grounding DINO achieves strong zero-shot detection performance on the COCO, LVIS, and ODinW benchmarks, and supports referring expression comprehension. It is widely used as a foundation for open-vocabulary detection pipelines and as the detection stage in systems such as Grounded-SAM, where its boxes serve as prompts for segmentation. The model is particularly suited to applications that require flexible, text-driven object localization across diverse domains.

YOLOv4

YOLOv4 is an object detection model developed by Alexey Bochkovskiy, Chien-Yao Wang, and Hong-Yuan Mark Liao at Academia Sinica, released in April 2020 via the Darknet framework. It combines a CSPDarknet53 backbone, PANet neck, and YOLOv3 detection head with a large set of improvements grouped as Bag of Freebies (training-time techniques that add no inference cost) and Bag of Specials (modules that add a small inference cost in exchange for higher accuracy).

YOLOv4 achieves 43.5% AP on COCO at 65 FPS on a Tesla V100 GPU. The Darknet implementation is the original version, distinguishing it from subsequent PyTorch-based reimplementations. It remains a widely referenced detection architecture and a supported training target in Roboflow Inference.
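The throughput figure translates directly into a per-frame time budget. A small sketch of that arithmetic follows; the helper name fps_to_latency_ms is hypothetical, and it assumes frames are processed one at a time with no batching.

```python
def fps_to_latency_ms(fps):
    """Convert frames per second into average milliseconds per frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# 65 FPS is the Tesla V100 figure quoted above.
latency = fps_to_latency_ms(65)
print(f"{latency:.1f} ms per frame")  # prints "15.4 ms per frame"
```

At roughly 15 ms per frame, the model fits within the 33 ms budget of a 30 FPS video stream on that hardware, with time left over for pre- and post-processing.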

Grounding DINO vs YOLOv4 Comparison Table

Property | Grounding DINO | YOLOv4
Organization | IDEA Research | Academia Sinica
Category | open | open
Modality | vision | vision
Release Date | Mar 2023 | Apr 2020
Context Window | - | -
Parameters | 172M-341M | -
License | Apache 2.0 | -
Vision Tasks
Object Detection
Model Features
Foundation Vision
Zero-shot Detection