Google: MobileNet SSD v2

MobileNet SSD v2 Overview

MobileNet SSD v2 is a lightweight object detection model developed by Google Research, released in January 2018. It combines the MobileNetV2 backbone with the Single Shot MultiBox Detector (SSD) framework to produce a model optimized for inference on mobile and edge devices. MobileNetV2 introduces inverted residuals and linear bottlenecks, which reduce computation relative to its predecessor, MobileNetV1, while preserving representational capacity.

MobileNet SSD v2 is designed for real-time on-device detection, making it suitable for mobile apps, embedded systems, and IoT devices. It performs object detection across a fixed set of categories and can be fine-tuned on custom datasets. It trades peak accuracy for reduced inference cost and model size relative to larger two-stage detectors.
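
To make the cost argument behind inverted residuals concrete, the following is a minimal sketch that counts multiply-accumulate operations (MACs) for one inverted residual block and compares it with a standard 3x3 convolution running at the same expanded width. The function names, the 56x56x24 feature map, and the expansion factor of 6 are illustrative assumptions, not values taken from this page, and the count ignores biases, normalization, and activations.

```python
def standard_conv_macs(h, w, c_in, c_out, k=3):
    """MACs for a dense k x k convolution at stride 1 with 'same' padding."""
    return h * w * c_in * c_out * k * k


def inverted_residual_macs(h, w, c_in, c_out, expansion=6, k=3):
    """MACs for an inverted residual block at stride 1.

    The block is modelled as three layers:
      1. 1x1 pointwise expansion from c_in to c_in * expansion channels
      2. k x k depthwise convolution on the expanded tensor
      3. 1x1 linear projection (no activation) down to c_out channels
    """
    hidden = c_in * expansion                 # assumed expansion factor
    expand = h * w * c_in * hidden            # 1x1 expansion
    depthwise = h * w * hidden * k * k        # one k x k filter per channel
    project = h * w * hidden * c_out          # 1x1 linear bottleneck
    return expand + depthwise + project


if __name__ == "__main__":
    h = w = 56
    channels = 24
    block = inverted_residual_macs(h, w, channels, channels)
    dense = standard_conv_macs(h, w, channels * 6, channels * 6)
    print(f"inverted residual block: {block:,} MACs")
    print(f"dense 3x3 conv at expanded width: {dense:,} MACs")
    print(f"ratio: {dense / block:.1f}x")
```

The comparison is against a dense convolution at the expanded width because that is where the block does its spatial filtering; the tensors passed between blocks stay at the narrow bottleneck width.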

MobileNet SSD v2 Details & Performance

Details

Vision Tasks: Object Detection

Features: Real-Time Vision


Alternatives to MobileNet SSD v2

Other models worth comparing for similar use cases.

YOLOv5
YOLOv5 is an object detection model developed by Ultralytics, released in June 2020 under the AGPL-3.0 license. It is implemented in PyTorch and offers a more accessible, better-documented implementation than earlier Darknet-based YOLO versions, with an integrated training and export pipeline supporting a wide range of deployment targets. YOLOv5 uses a CSP backbone, PANet neck, and a single-stage detection head with anchor-based regression.
YOLOv5 is available in five sizes from Nano to Extra Large and supports export to ONNX, TensorRT, CoreML, and other formats. It is one of the most widely deployed object detection models in production environments and remains a common starting point for custom detection model training due to its documentation, community support, and compatibility with Roboflow Inference.
YOLOv8
YOLOv8 is an object detection and multi-task vision model developed by Ultralytics, released in January 2023 under the AGPL-3.0 license. It succeeds YOLOv5 and introduces an anchor-free detection head, a new C2f module for improved gradient flow, and a decoupled head that separates classification and regression tasks. These changes improve both accuracy and training efficiency compared to earlier Ultralytics models.
YOLOv8 supports object detection, instance segmentation, image classification, pose estimation, and oriented bounding box detection within a unified codebase. It is available in five sizes from Nano to Extra Large and exports to ONNX, TensorRT, CoreML, and other formats. YOLOv8 is one of the most widely adopted detection models in production and is directly supported by Roboflow Inference for custom model training and deployment.
YOLO11
YOLO11 is an object detection and multi-task vision model developed by Ultralytics, released in September 2024 under the AGPL-3.0 license. It is the latest generation in the Ultralytics YOLO series and supports object detection, instance segmentation, image classification, pose estimation, and oriented bounding box detection within a single unified framework. YOLO11 introduces architectural refinements that improve accuracy while reducing parameter count compared to YOLOv8 at equivalent model sizes.
YOLO11 is available in five model sizes from Nano to Extra Large and is deployable through the Ultralytics Python package, Roboflow Inference, and export formats including ONNX, TensorRT, and CoreML. It supports fine-tuning on custom datasets through the standard Ultralytics training API.
Google
EfficientDet
EfficientDet is an object detection model developed by Google Research, released in November 2019. It introduces a compound scaling method that uniformly scales the resolution, depth, and width of the detection network, building on the EfficientNet backbone and a bidirectional feature pyramid network (BiFPN) for multi-scale feature fusion. This design achieves strong accuracy-efficiency tradeoffs across a family of models ranging from EfficientDet-D0 to D7.
EfficientDet-D7 achieves 55.1% AP on COCO while remaining significantly smaller in parameter count than comparable models at the time of release. The model family is well suited for deployment scenarios where compute budget varies, as smaller variants can run on edge hardware while larger variants are competitive with heavier architectures on server-side inference.
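
As a rough illustration of the compound scaling idea in the EfficientDet description, the sketch below derives input resolution, BiFPN width and depth, and prediction-head depth from a single coefficient phi. The constants follow the heuristic scaling equations as recalled from the EfficientDet paper and are an assumption here, not something stated on this page; the released configurations round the widths to hardware-friendly values and the largest variants deviate from the formula. The function name `efficientdet_scaling` is hypothetical.

```python
def efficientdet_scaling(phi):
    """Return unrounded scaling targets for compound coefficient phi.

    Assumed heuristics (not taken from this page):
      input resolution = 512 + 128 * phi
      BiFPN width      = 64 * 1.35 ** phi   (channels, before rounding)
      BiFPN depth      = 3 + phi            (repeated BiFPN layers)
      head depth       = 3 + phi // 3       (box/class prediction layers)
    """
    return {
        "input_resolution": 512 + 128 * phi,
        "bifpn_width": 64 * (1.35 ** phi),
        "bifpn_depth": 3 + phi,
        "head_depth": 3 + phi // 3,
    }


if __name__ == "__main__":
    for phi in range(0, 4):
        cfg = efficientdet_scaling(phi)
        print(
            f"phi={phi}: resolution={cfg['input_resolution']}, "
            f"bifpn_width~{cfg['bifpn_width']:.1f}, "
            f"bifpn_depth={cfg['bifpn_depth']}, head_depth={cfg['head_depth']}"
        )
```

Because every dimension grows from the same coefficient, choosing a variant amounts to choosing one number that matches the available compute budget.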
Baidu
RT-DETR
RT-DETR (Real-Time Detection Transformer) is an object detection model developed by Baidu, released in April 2023 under the Apache 2.0 license. Its authors present it as the first real-time end-to-end object detector built on a transformer. It addresses the inference speed limitations of earlier DETR models through an efficient hybrid encoder that decouples intra-scale interaction from cross-scale fusion, so the model can process multi-scale features without the high computational overhead of standard transformer encoders.
RT-DETR achieves 53.1% AP on COCO at 108 FPS on an NVIDIA T4 GPU for the RT-DETR-R50 variant, outperforming comparably sized YOLO detectors at similar speeds. It maintains end-to-end inference without non-maximum suppression, simplifying deployment pipelines. RT-DETR established the baseline for real-time transformer detection and has been extended by subsequent works including RF-DETR and RT-DETRv2.
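
For context on what running without non-maximum suppression saves, here is a minimal pure-Python sketch of the greedy NMS post-processing step that anchor-based and most single-stage detectors apply to their raw predictions. It is a generic textbook version, not the implementation used by any model listed here; the box format (x1, y1, x2, y2) and the 0.5 IoU threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it.

    Returns the indices of the kept boxes, ordered by descending score.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

The threshold is an extra hyperparameter that has to be tuned per deployment, which is part of why an end-to-end detector that skips this step can simplify a pipeline.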
Azure
Faster R-CNN
Faster R-CNN is an object detection model introduced by Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun at Microsoft Research, first posted to arXiv in June 2015 and presented at NIPS 2015. It builds on R-CNN and Fast R-CNN by introducing the Region Proposal Network (RPN), a fully convolutional network that shares features with the detection network and generates object proposals at negligible additional cost. This made Faster R-CNN the first near-real-time deep learning object detector based on region proposals.
Faster R-CNN achieved strong detection accuracy on PASCAL VOC and MS COCO at the time of release. It remains a widely referenced architecture in computer vision research and is available through Meta's Detectron2 framework as a maintained PyTorch implementation. It is most appropriate for offline or server-side inference tasks where accuracy is prioritized over latency, as its two-stage pipeline carries higher inference cost than single-stage detectors.

MobileNet SSD v2 License

MIT

License terms and commercial-use guidance for MobileNet SSD v2.

License information is provided as a guide and is not legal advice.