Google: EfficientDet

EfficientDet Overview

EfficientDet is an object detection model developed by Google Research, released in November 2019. It introduces a compound scaling method that uniformly scales the resolution, depth, and width of the detection network, building on the EfficientNet backbone and a bidirectional feature pyramid network (BiFPN) for multi-scale feature fusion. This design achieves strong accuracy-efficiency tradeoffs across a family of models ranging from EfficientDet-D0 to D7.
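The compound scaling idea can be sketched in a few lines. The snippet below is a minimal illustration, assuming the scaling heuristics as given in the EfficientDet paper: BiFPN width grows geometrically with a single coefficient `phi`, while BiFPN depth, prediction-head depth, and input resolution grow linearly. The function name `efficientdet_scaling` and the dictionary keys are hypothetical and are not part of any released API. The published configurations round the width to hardware-friendly values and deviate from these formulas for the largest variants, so treat the outputs as approximate.

```python
def efficientdet_scaling(phi: int) -> dict:
    """Approximate sizes for EfficientDet-D{phi} from one compound coefficient.

    Assumption: these formulas follow the heuristics in the EfficientDet
    paper; released models use rounded values and differ at the top end.
    """
    if phi < 0:
        raise ValueError("phi must be non-negative")
    return {
        # BiFPN channel count, before rounding to a hardware-friendly value
        "bifpn_width": 64 * 1.35 ** phi,
        # Number of repeated BiFPN layers
        "bifpn_depth": 3 + phi,
        # Number of conv layers in the box/class prediction heads
        "head_depth": 3 + phi // 3,
        # Square input resolution in pixels
        "input_resolution": 512 + 128 * phi,
    }


# Print the approximate configuration for the four smallest variants.
for phi in range(4):
    print(f"D{phi}:", efficientdet_scaling(phi))
```

The backbone is scaled separately by swapping in a larger EfficientNet variant, which this sketch does not model.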

EfficientDet-D7 achieves 55.1% AP on COCO while remaining significantly smaller in parameter count than comparable models at the time of release. The model family is well suited for deployment scenarios where compute budget varies, as smaller variants can run on edge hardware while larger variants are competitive with heavier architectures on server-side inference.

EfficientDet Details & Performance

Vision Tasks

Object Detection

Alternatives to EfficientDet

Other models worth comparing for similar use cases.

YOLOv8
YOLOv8 is an object detection and multi-task vision model developed by Ultralytics, released in January 2023 under the AGPL-3.0 license. It succeeds YOLOv5 and introduces an anchor-free detection head, a new C2f module for improved gradient flow, and a decoupled head that separates classification and regression tasks. These changes improve both accuracy and training efficiency compared to earlier Ultralytics models. YOLOv8 supports object detection, instance segmentation, image classification, pose estimation, and oriented bounding box detection within a unified codebase. It is available in five sizes from Nano to Extra Large and exports to ONNX, TensorRT, CoreML, and other formats. YOLOv8 is one of the most widely adopted detection models in production and is directly supported by Roboflow Inference for custom model training and deployment.
YOLO11
YOLO11 is an object detection and multi-task vision model developed by Ultralytics, released in September 2024 under the AGPL-3.0 license. It is the latest generation in the Ultralytics YOLO series and supports object detection, instance segmentation, image classification, pose estimation, and oriented bounding box detection within a single unified framework. YOLO11 introduces architectural refinements that improve accuracy while reducing parameter count compared to YOLOv8 at equivalent model sizes. YOLO11 is available in five model sizes from Nano to Extra Large and is deployable through the Ultralytics Python package, Roboflow Inference, and export formats including ONNX, TensorRT, and CoreML. It supports fine-tuning on custom datasets through the standard Ultralytics training API.
Baidu: RT-DETR
RT-DETR (Real-Time Detection Transformer) is an object detection model developed by Baidu, released in April 2023 under the Apache 2.0 license. It is the first transformer-based real-time object detector, addressing the inference speed limitations of earlier DETR models through an efficient hybrid encoder that decouples intra-scale interaction from cross-scale fusion. This lets the model process multi-scale features without the high computational overhead of standard transformer encoders. The RT-DETR-R50 variant achieves 53.1% AP on COCO at 108 FPS on an NVIDIA T4 GPU, outperforming comparably sized YOLO detectors at similar speeds. It maintains end-to-end inference without non-maximum suppression, simplifying deployment pipelines. RT-DETR established the baseline for real-time transformer detection and has been extended by subsequent works including RF-DETR and RT-DETRv2.
RF-DETR
RF-DETR is a real-time transformer-based object detection model developed by Roboflow, with code and weights first released in March 2025 under the Apache 2.0 license. It is the first real-time model to exceed 60 AP on the Microsoft COCO benchmark, built on a DINOv2 vision transformer backbone with weight-sharing neural architecture search used to identify accuracy-latency trade-offs. The full family spans six sizes from Nano (30.5M parameters, 384×384 input) to 2XL (126.9M parameters, 880×880 input), with the accompanying research paper accepted to ICLR 2026. RF-DETR is designed for strong domain adaptability, achieving state-of-the-art performance on RF100-VL, a benchmark measuring generalization to real-world object detection tasks across diverse domains. It is deployable through Roboflow Inference and supports fine-tuning on custom datasets, making it well suited for domain-specific applications with limited training data.
D-FINE
D-FINE is a real-time object detection model introduced in October 2024 by researchers at the University of Science and Technology of China. It builds on the DETR family of transformer-based detectors by reformulating bounding box regression as a Fine-grained Distribution Refinement task. Rather than predicting box coordinates directly, D-FINE iteratively refines probability distributions over coordinate offsets across decoder layers, which provides finer localization granularity without adding inference cost. The architecture also replaces the encoder's CSP blocks with GELAN modules and inserts a Target Gating Layer after the decoder's cross-attention to reduce representational entanglement across queries. A second contribution, Global Optimal Localization Self-Distillation, transfers localization knowledge from refined deeper-layer predictions back to earlier decoder layers through internal self-distillation. D-FINE is released in five model sizes (Nano, Small, Medium, Large, and X), with D-FINE-L achieving 54.0% AP on the Microsoft COCO benchmark at 124 FPS on an NVIDIA T4 GPU, and D-FINE-X reaching 55.8% AP at 78 FPS. Pretraining on the Objects365 dataset further improves accuracy to 57.1% AP for the L variant and 59.3% AP for the X variant. The paper was accepted at ICLR 2025 as a Spotlight. Code and pretrained weights are released under the Apache 2.0 license, making the model suitable for commercial use.
Google: MobileNet SSD v2
MobileNet SSD v2 is a lightweight object detection model developed by Google Research, released in January 2018. It combines the MobileNetV2 backbone with the Single Shot MultiBox Detector (SSD) framework to produce a model optimized for inference on mobile and edge devices. MobileNetV2 introduces inverted residuals and linear bottlenecks to reduce computation while maintaining representational capacity compared to its predecessor. MobileNet SSD v2 is designed for real-time on-device detection, making it suitable for mobile apps, embedded systems, and IoT devices. It performs object detection across a fixed set of categories and can be fine-tuned on custom datasets. It trades peak accuracy for reduced inference cost and model size relative to larger two-stage detectors.

EfficientDet License

Apache 2.0

License terms and commercial-use guidance for EfficientDet.

License information is provided as a guide and is not legal advice.