ResNet-50 is a deep convolutional neural network architecture introduced in the 2015 paper "Deep Residual Learning for Image Recognition" by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun at Microsoft Research. It is part of the ResNet (Residual Network) family, which introduced residual connections: identity shortcut paths that add a block's input to its output, so each block learns a residual F(x) and outputs F(x) + x, and gradients can flow past the block's layers during training. This design addressed the degradation problem, in which adding layers to a plain deep network increased training error, and made networks of 50, 101, and 152 layers practical to train. ResNet-50 is the 50-layer variant, with approximately 25.6 million parameters. It consists of a 7×7 convolutional stem, 16 bottleneck residual blocks (each a 1×1, 3×3, and 1×1 convolution) arranged in four stages of 3, 4, 6, and 3 blocks, and a final fully connected classification layer.
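The layer and parameter counts above can be checked with a few lines of arithmetic. The sketch below is a minimal illustration, not a reference implementation. It assumes the common torchvision-style layout: bias-free convolutions, each followed by batch normalization with a learnable scale and shift, a 1×1 projection shortcut on the first block of every stage, and a 1000-class ImageNet head. All function and constant names are hypothetical and chosen for this example only.

```python
# Stage layout assumed for ResNet-50: (bottleneck width, number of blocks).
STAGES = [(64, 3), (128, 4), (256, 6), (512, 3)]
EXPANSION = 4        # a bottleneck block outputs width * 4 channels
NUM_CLASSES = 1000   # ImageNet-1k classifier head (assumption)


def conv_params(c_in, c_out, k):
    """Weights of a bias-free k x k convolution."""
    return c_in * c_out * k * k


def bn_params(channels):
    """Learnable scale and shift of one batch-norm layer."""
    return 2 * channels


def bottleneck_params(c_in, width, projection):
    """Parameters of one 1x1 -> 3x3 -> 1x1 bottleneck block."""
    c_out = width * EXPANSION
    total = conv_params(c_in, width, 1) + bn_params(width)     # 1x1 reduce
    total += conv_params(width, width, 3) + bn_params(width)   # 3x3
    total += conv_params(width, c_out, 1) + bn_params(c_out)   # 1x1 expand
    if projection:
        # 1x1 projection on the shortcut when the tensor shape changes
        total += conv_params(c_in, c_out, 1) + bn_params(c_out)
    return total


def resnet50_params():
    total = conv_params(3, 64, 7) + bn_params(64)   # stem: 7x7 conv + BN
    c_in = 64
    for width, blocks in STAGES:
        for i in range(blocks):
            total += bottleneck_params(c_in, width, projection=(i == 0))
            c_in = width * EXPANSION
    total += c_in * NUM_CLASSES + NUM_CLASSES       # fully connected head
    return total


def weighted_layers():
    # stem conv + three convs per bottleneck block + final FC layer
    return 1 + 3 * sum(blocks for _, blocks in STAGES) + 1


print(resnet50_params())   # 25557032
print(weighted_layers())   # 50
```

Under these assumptions the total is 25,557,032 learnable parameters, which rounds to the 25.6 million quoted above. The "50" counts only the weighted layers on the main path; projection shortcuts and batch-norm layers are not included in that number.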
ResNet-50 was trained on the ImageNet classification benchmark; the ResNet family, entered as an ensemble, won the ILSVRC 2015 classification task. Beyond classification, ResNet-50 became a widely used backbone feature extractor for downstream tasks, including object detection (as a common base network for Faster R-CNN, Mask R-CNN, and RetinaNet) and semantic and instance segmentation. Most current implementations, including those in PyTorch torchvision, TensorFlow, and NVIDIA NGC, use the ResNet-50 v1.5 variant, which moves the stride-2 downsampling from the first 1×1 convolution to the 3×3 convolution in each bottleneck block that downsamples. This yields approximately 0.5% higher top-1 accuracy than the original v1 formulation at a small throughput cost. ResNet-50 remains a common reference architecture in computer vision benchmarks and a standard backbone choice in detection and segmentation frameworks. The original Microsoft Research code is released under the MIT license.
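The throughput cost of v1.5 follows from where the stride sits: with the stride on the 3×3 convolution, the first 1×1 convolution runs at full input resolution. The sketch below estimates multiply-accumulate operations (MACs) for the three downsampling bottleneck blocks under both layouts. It is an illustration only, assuming a 224×224 input (so the stages see 56×56, 28×28, and 14×14 feature maps), a 1×1 stride-2 projection shortcut in both variants, and ignoring batch normalization and activations. The function and variable names are hypothetical.

```python
def conv_macs(out_size, c_in, c_out, k):
    """MACs of a k x k convolution producing an out_size x out_size map."""
    return out_size * out_size * c_in * c_out * k * k


def downsample_block_macs(size, c_in, width, stride_in_first_1x1):
    """MACs of one stride-2 bottleneck block.

    stride_in_first_1x1=True  -> v1 layout (stride on the first 1x1 conv)
    stride_in_first_1x1=False -> v1.5 layout (stride on the 3x3 conv)
    """
    c_out = width * 4
    out = size // 2
    # Resolution at which the first 1x1 convolution produces its output.
    mid = out if stride_in_first_1x1 else size
    macs = conv_macs(mid, c_in, width, 1)      # 1x1 reduce
    macs += conv_macs(out, width, width, 3)    # 3x3
    macs += conv_macs(out, width, c_out, 1)    # 1x1 expand
    macs += conv_macs(out, c_in, c_out, 1)     # projection shortcut
    return macs


# (input feature-map size, input channels, bottleneck width) for the first
# block of stages 2, 3, and 4, assuming a 224 x 224 input image.
DOWNSAMPLING_BLOCKS = [(56, 256, 128), (28, 512, 256), (14, 1024, 512)]

v1_macs = sum(downsample_block_macs(s, c, w, True)
              for s, c, w in DOWNSAMPLING_BLOCKS)
v15_macs = sum(downsample_block_macs(s, c, w, False)
               for s, c, w in DOWNSAMPLING_BLOCKS)

print(v1_macs)              # 886308864
print(v15_macs)             # 1117519872
print(v15_macs - v1_macs)   # 231211008
```

Under these assumptions, v1.5 adds roughly 0.23 billion MACs per image across the three downsampling blocks, while the parameter count is unchanged because only the stride placement differs.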
License information is provided as a guide and is not legal advice.