YOLO26: Key Architectural Enhancements and Performance Benchmarking for Real-Time Object Detection



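YOLO26 is the latest object detection model released by Ultralytics, focused on edge computing and real-time deployment. Its core changes are: removing Distribution Focal Loss (DFL) to simplify box regression; adopting end-to-end inference without non-maximum suppression (NMS) to cut latency; introducing Progressive Loss Balancing (ProgLoss) and Small-Target-Aware Label Assignment (STAL) to improve training stability and small-object detection; and using the MuSGD optimizer to speed up convergence. The model supports multiple tasks, including detection, segmentation, and pose estimation, exports to formats such as ONNX and TensorRT, and performs well on edge devices such as NVIDIA Jetson, striking a better balance between accuracy and efficiency.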

By 漫步 · Published 2026/2/8 · Updated July 2026

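Abstract

This study presents a comprehensive analysis of Ultralytics YOLO26, focusing on its key architectural enhancements and performance benchmarks for real-time edge object detection. Released in September 2025, YOLO26 is the newest and most advanced member of the YOLO family, built to deliver efficient, accurate, and easily deployable solutions on edge and low-power devices. The paper details YOLO26's architectural innovations in turn: the removal of Distribution Focal Loss (DFL), end-to-end NMS-free inference, the integration of Progressive Loss Balancing (ProgLoss) and Small-Target-Aware Label Assignment (STAL), and the MuSGD optimizer for stable convergence. Beyond architecture, the study positions YOLO26 as a multi-task framework supporting object detection, instance segmentation, pose/keypoint estimation, oriented object detection, and classification. We present performance benchmarks of YOLO26 on edge devices such as NVIDIA Jetson Nano and Orin and compare the results with YOLOv8, YOLO11, YOLOv12, YOLOv13, and Transformer-based detectors. The paper further examines real-time deployment paths, flexible export options (ONNX, TensorRT, CoreML, TFLite), and INT8/FP16 quantization. Practical use cases of YOLO26 in robotics, manufacturing, and IoT are highlighted to show its cross-industry adaptability. Finally, insights on deployment efficiency and broader implications are discussed, and future directions for YOLO26 and the YOLO series are outlined.

Keywords: YOLO26 · Edge AI · Multi-task object detection · NMS-free inference · Small-object recognition · You Only Look Once · Object detection · MuSGD optimizer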

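1 Introduction

Object detection has become one of the most critical tasks in computer vision, enabling machines to localize and classify multiple objects in images or video streams [1, 2]. From autonomous driving and robotics to surveillance, medical imaging, agriculture, and smart manufacturing, real-time object detection algorithms are a backbone of artificial intelligence (AI) applications [3, 4]. Among these algorithms, the You Only Look Once (YOLO) series has become the most influential family of real-time object detectors, combining accuracy with unprecedented inference speed [5, 6, 7]. Since its introduction in 2016, YOLO has gone through many architectural revisions, each addressing limitations of its predecessors while integrating cutting-edge advances in neural network design, loss functions, and deployment efficiency [5]. Released in September 2025, YOLO26 is the latest milestone on this trajectory, introducing architectural simplifications, a novel optimizer, and enhanced edge deployment capabilities designed for low-power devices.

Table 1 compares the YOLO models from YOLOv1 through YOLOv13 and YOLO26 in detail, highlighting their release years, key architectural innovations, performance enhancements, and development frameworks.

[Figure 1: Comparison table of YOLO model evolution]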

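The YOLO framework was first proposed by Joseph Redmon and colleagues in 2016 and marked a paradigm shift in object detection [8]. Unlike traditional two-stage detectors such as R-CNN [18] and Faster R-CNN [19], which separate region proposal from classification, YOLO formulates detection as a single regression problem [20]. By predicting bounding boxes and class probabilities directly in one forward pass, YOLO reaches real-time speed while keeping competitive accuracy [21, 20]. This efficiency made YOLOv1 highly attractive for applications where latency is critical, including robotics, autonomous navigation, and real-time video analytics. The successors YOLOv2 (2017) [9] and YOLOv3 (2018) [10] markedly improved accuracy while retaining real-time performance. YOLOv2 introduced batch normalization, anchor boxes, and multi-scale training, which improved robustness to objects of different sizes. YOLOv3 adopted a deeper architecture based on Darknet-53 together with multi-scale feature maps to improve small-object detection. These enhancements made YOLOv3 the de facto standard in academia and industry for the following years [22, 5, 23].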
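To make the "single regression problem" concrete, the following is a minimal sketch of how one grid cell of a YOLOv1-style output could be decoded. It assumes the layout described in the original YOLO paper [8]: an S×S grid in which each cell predicts B boxes as (x, y, w, h, confidence) plus C class probabilities. The function name `decode_cell` and the plain-list representation are illustrative choices, not code from any YOLO release.

```python
def decode_cell(pred, row, col, S=7, B=2, C=20):
    """Decode one grid cell of a YOLOv1-style output vector.

    pred holds B boxes as (x, y, w, h, confidence) followed by C class
    probabilities; x and y are offsets inside the cell, w and h are
    fractions of the whole image.
    """
    if len(pred) != B * 5 + C:
        raise ValueError("expected a vector of length B * 5 + C")
    class_probs = pred[B * 5:]
    best_cls = max(range(C), key=lambda k: class_probs[k])
    detections = []
    for b in range(B):
        x, y, w, h, conf = pred[b * 5:(b + 1) * 5]
        cx = (col + x) / S  # cell offset -> image-relative center
        cy = (row + y) / S
        score = conf * class_probs[best_cls]  # class-specific confidence
        detections.append(((cx, cy, w, h), best_cls, score))
    return detections
```

Running this over every cell of the grid yields all candidate boxes from a single forward pass, which is why no separate region-proposal stage is needed.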

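As demand for higher accuracy grew, particularly in challenging domains such as aerial imagery, agriculture, and medical analysis, YOLO models evolved toward more advanced architectures. YOLOv4 (2020) [11] introduced Cross Stage Partial Networks (CSPNet), improved activation functions such as Mish, and advanced training strategies including mosaic data augmentation and CIoU loss. YOLOv5 (Ultralytics, 2020), although unofficial, became very popular thanks to its PyTorch implementation, broad community support, and simplified deployment across diverse platforms. YOLOv5 also brought modularity, making it easier to adapt to segmentation, classification, and edge applications. Further developments include YOLOv6 [12] and YOLOv7 [13] (2022), which integrated advanced optimization techniques, parameter-efficient modules, and Transformer-inspired blocks. These iterations pushed YOLO toward state-of-the-art (SoTA) accuracy benchmarks while keeping the focus on real-time inference. By this point, the YOLO ecosystem had firmly established itself as the leading model family in object detection research and deployment.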
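The CIoU loss mentioned above extends plain IoU with a center-distance penalty and an aspect-ratio consistency term. The sketch below follows the commonly cited definition, CIoU = IoU − ρ²/c² − αv, for boxes given as (x1, y1, x2, y2). The function names and the `eps` guard are assumptions made for illustration; this is not the YOLOv4 training code.

```python
import math


def ciou(box_a, box_b, eps=1e-9):
    """Complete IoU between two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1

    # Plain IoU
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = wa * ha + wb * hb - inter + eps
    iou = inter / union

    # rho^2: squared distance between the two box centers
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # c^2: squared diagonal of the smallest box enclosing both
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # v measures aspect-ratio mismatch; alpha is its trade-off weight
    v = (4 / math.pi ** 2) * (math.atan(wb / (hb + eps)) - math.atan(wa / (ha + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return iou - rho2 / c2 - alpha * v


def ciou_loss(pred_box, target_box):
    """Loss is 1 - CIoU, so a perfect match gives a loss near zero."""
    return 1.0 - ciou(pred_box, target_box)
```

Unlike plain IoU, the loss still carries a useful signal when the boxes do not overlap at all, because the center-distance term keeps growing as the boxes move apart.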

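Ultralytics, the primary maintainer of modern YOLO releases, redefined the framework with YOLOv8 (2023) [24]. YOLOv8 adopted a decoupled detection head, anchor-free prediction, and improved training strategies, yielding substantial gains in accuracy and deployment flexibility [25]. It has been widely adopted in industry because of its clean Python API, its compatibility with TensorRT, CoreML, and ONNX, and the availability of variants (nano, small, medium, large, and extra-large) tuned for different speed-accuracy trade-offs. YOLOv9 [14], YOLOv10 [15], and YOLO11 followed in quick succession, each iteration pushing the boundaries of architecture and performance. YOLOv9 introduced GELAN (Generalized Efficient Layer Aggregation Network) and programmable gradient information (PGI), combining efficiency with greater representational capacity. YOLOv10 focused on balancing accuracy and inference latency through hybrid task-aligned assignment. YOLO11 further refined Ultralytics' vision, delivering higher efficiency on GPUs while maintaining strong small-object performance [5]. Together, these models cemented Ultralytics' reputation for producing production-ready YOLO releases suited to modern deployment pipelines.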
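Anchor-free prediction can be pictured as each feature-map cell regressing its distances to the four sides of a box, rather than offsets relative to predefined anchor boxes. The sketch below assumes cell centers at half-integer grid coordinates and a known feature-map stride; `dist_to_box` is a hypothetical helper written for illustration, not an Ultralytics function.

```python
def dist_to_box(col, row, ltrb, stride):
    """Convert per-cell side distances into an (x1, y1, x2, y2) pixel box.

    col, row : integer grid position of the cell
    ltrb     : distances (left, top, right, bottom) in grid units
    stride   : pixels per grid cell at this feature-map level
    """
    cx, cy = col + 0.5, row + 0.5  # cell center acts as the reference point
    left, top, right, bottom = ltrb
    return (
        (cx - left) * stride,
        (cy - top) * stride,
        (cx + right) * stride,
        (cy + bottom) * stride,
    )


# Example: cell (2, 3) on a stride-8 feature map
box = dist_to_box(2, 3, (1.5, 0.5, 2.5, 1.5), stride=8)
```

Because the reference point is fixed by the grid, there are no anchor shapes to tune per dataset, which is one reason this formulation simplifies deployment.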

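After YOLO11, the alternative releases YOLOv12 [16] and YOLOv13 [17] introduced attention-centric designs and advanced architectural components aimed at maximizing accuracy across diverse datasets. These models explored multi-head self-attention, improved multi-scale fusion, and stronger training regularization strategies. Although they provide strong baselines, they still rely on non-maximum suppression (NMS) and Distribution Focal Loss (DFL), which add latency overhead and export challenges, especially on low-power devices. The limitations of NMS-based post-processing and complex loss formulations motivated the development of YOLO26 (Ultralytics YOLO26 official source). In September 2025, at the YOLO Vision 2025 event in London, Ultralytics released YOLO26 as a next-generation model optimized for edge computing, robotics, and mobile AI.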
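For readers unfamiliar with the two components singled out here, the sketch below shows a simplified, pure-Python view of each: a DFL-style decode, where a box side is read out as the expected value of a softmax over discrete bins, and classic greedy NMS, which runs after the network and discards overlapping boxes. Both functions are illustrative assumptions about how such steps are conventionally written. They are not taken from YOLO26 or any Ultralytics release, and they exist only to show the extra work that an end-to-end, DFL-free design avoids.

```python
import math


def dfl_expected_distance(logits):
    """DFL-style decode: softmax over bins, then the expected bin index."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return sum(i * e / total for i, e in enumerate(exps))


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_thr=0.5):
    """Keep the highest-scoring boxes, dropping any that overlap a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
```

The NMS loop is sequential and data-dependent, so its cost varies with the number of candidate boxes and it is awkward to express inside an exported graph. This is the kind of latency and export friction the paragraph above attributes to NMS-based pipelines.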

References

[1] Zhong-Qiu Zhao, Peng Zheng, Shou-tao Xu, and Xindong Wu. Object detection with deep learning: A review. IEEE transactions on neural networks and learning systems, 30(11):3212–3232, 2019.
[2] Zhengxia Zou, Keyan Chen, Zhenwei Shi, Yuhong Guo, and Jieping Ye. Object detection in 20 years: A survey. Proceedings of the IEEE, 111(3):257–276, 2023.
[3] Chhavi Rana et al. Artificial intelligence based object detection and traffic prediction by autonomous vehicles—a review. Expert Systems with Applications, 255:124664, 2024.
[4] Zohaib Khan, Yue Shen, and Hui Liu. Object detection in agriculture: A comprehensive review of methods, applications, challenges, and future directions. Agriculture, 15(13):1351, 2025.
[5] Ranjan Sapkota, Marco Flores-Calero, Rizwan Qureshi, Chetan Badgujar, Upesh Nepal, Alwin Poulose, Peter Zeno, Uday Bhanu Prakash Vaddevolu, Sheheryar Khan, Maged Shoman, et al. Yolo advances to its genesis: a decadal and comprehensive review of the you only look once (yolo) series. Artificial Intelligence Review, 58(9):274, 2025.
[6] Ranjan Sapkota, Rahul Harsha Cheppally, Ajay Sharda, and Manoj Karkee. Rf-detr object detection vs yolov12: A study of transformer-based and cnn-based architectures for single-class and multi-class greenfruit detection in complex orchard environments under label ambiguity. arXiv preprint arXiv:2504.13099, 2025.
[7] Ranjan Sapkota, Awood Ahmed, and Manoj Karkee. Comparative analysis of yolov8 and mask r-cnn for instance segmentation in complex orchard environments. Artificial Intelligence in Agriculture, 13:84–99, 2024.
[8] Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 779–788, 2016.
[9] Joseph Redmon and Ali Farhadi. Yolo9000: better, faster, stronger. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 7263–7271, 2017.
[10] Joseph Redmon and Ali Farhadi. Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767, 2018.
[11] Alexey Bochkovskiy, Chien-Yao Wang, and Hong-Yuan Mark Liao. Yolov4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934, 2020.
[12] Chuyi Li, Lulu Li, Hongliang Jiang, Kaiheng Weng, Yifei Geng, Liang Li, Zaidan Ke, Qingyuan Li, Meng Cheng, Weiqiang Nie, et al. Yolov6: A single-stage object detection framework for industrial applications. arXiv preprint arXiv:2209.02976, 2022.
[13] Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao. Yolov7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 7464–7475, 2023.
[14] Chien-Yao Wang, I-Hau Yeh, and Hong-Yuan Mark Liao. Yolov9: Learning what you want to learn using programmable gradient information. In European conference on computer vision, pages 1–21. Springer, 2024.
[15] Ao Wang, Hui Chen, Lihao Liu, Kai Chen, Zijia Lin, Jungong Han, et al. Yolov10: Real-time end-to-end object detection. Advances in Neural Information Processing Systems, 37:107984–108011, 2024.
[16] Yunjie Tian, Qixiang Ye, and David Doermann. Yolov12: Attention-centric real-time object detectors. arXiv preprint arXiv:2502.12524, 2025.
[17] Mengqi Lei, Siqi Li, Yihong Wu, Han Hu, You Zhou, Xinhu Zheng, Guiguang Ding, Shaoyi Du, Zongze Wu, and Yue Gao. Yolov13: Real-time object detection with hypergraph-enhanced adaptive visual perception. arXiv preprint arXiv:2506.17733, 2025.
[18] Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision, pages 2961–2969, 2017.
[19] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. IEEE transactions on pattern analysis and machine intelligence, 39(6):1137–1149, 2016.
[20] Tausif Diwan, G Anirudh, and Jitendra V Tembhurne. Object detection using yolo: challenges, architectural successors, datasets and applications. Multimedia Tools and Applications, 82(6):9243–9275, 2023.
[21] Momina Liaqat Ali and Zhou Zhang. The yolo framework: A comprehensive review of evolution, applications, and benchmarks in object detection. Computers, 13(12):336, 2024.
[22] Kyriakos D Apostolidis and George A Papakostas. Delving into yolo object detection models: Insights into adversarial robustness. Electronics, 14(8):1624, 2025.
[23] Enerst Edozie, Aliyu Nuhu Shuaibu, Ukagwu Kelechi John, and Bashir Olaniyi Sadiq. Comprehensive review of recent developments in visual object detection based on deep learning. Artificial Intelligence Review, 58(9):277, 2025.
[24] Mupparaju Sohan, Thotakura Sai Ram, and Ch Venkata Rami Reddy. A review on yolov8 and its advancements. In International Conference on Data Intelligence and Cognitive Informatics, pages 529–545. Springer, 2024.
[25] J Javaria Farooq, Muhammad Muaz, Khurram Khan Jadoon, Nayyer Aafaq, and Muhammad Khizer Ali Khan. An improved yolov8 for foreign object debris detection with optimized architecture for small objects. Multimedia Tools and Applications, 83(21):60921–60947, 2024.
[26] Maria Trigka and Elias Dritsas. A comprehensive survey of machine learning techniques and models for object detection. Sensors, 25(1):214, 2025.
[27] Md Tanzib Hosain, Asif Zaman, Mushfiqur Rahman Abir, Shanjida Akter, Sawon Mursalin, and Shadman Sakeeb Khan. Synchronizing object detection: Applications, advancements and existing challenges. IEEE access, 12:54129–54167, 2024.
[28] Ambati Pravallika, Mohammad Farukh Hashmi, and Aditya Gupta. Deep learning frontiers in 3d object detection: a comprehensive review for autonomous driving. IEEE Access, 2024.
[29] Jiawei Tian, Seungho Lee, and Kyungtae Kang. Faster r-cnn in healthcare and disease detection: A comprehensive review. In 2025 International Conference on Electronics, Information, and Communication (ICEIC), pages 1–6. IEEE, 2025.
[30] Peng Fu and Jiyang Wang. Lithology identification based on improved faster r-cnn. Minerals, 14(9):954, 2024.
[31] Samiyaa Yaseen Mohammed. Architecture review: Two-stage and one-stage object detection. Franklin Open, page 100322, 2025.
[32] Richard Johnson. YOLO Object Detection Explained: Definitive Reference for Developers and Engineers. HiTeX Press, 2025.
[33] Daniel Pestana, Pedro R Miranda, João D Lopes, Rui P Duarte, Mário P Véstias, Horacio C Neto, and José T De Sousa. A full featured configurable accelerator for object detection with yolo. IEEE Access, 9:75864–75877, 2021.
[34] Duy Thanh Nguyen, Tuan Nghia Nguyen, Hyun Kim, and Hyuk-Jae Lee. A high-throughput and power-efficient fpga implementation of yolo cnn for object detection. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 27(8):1861–1873, 2019.
[35] Caiwen Ding, Shuo Wang, Ning Liu, Kaidi Xu, Yanzhi Wang, and Yun Liang. Req-yolo: A resource-aware, efficient quantization framework for object detection on fpgas. In proceedings of the 2019 ACM/SIGDA international symposium on field-programmable gate arrays, pages 33–42, 2019.
[36] Patricia Citranegara Kusuma and Benfano Soewito. Multi-object detection using yolov7 object detection algorithm on mobile device. Journal of Applied Engineering and Technological Science (JAETS), 5(1):305–320, 2023.
[37] Nico Surantha and Nana Sutisna. Key considerations for real-time object recognition on edge computing devices. Applied Sciences, 15(13):7533, 2025.
[38] Kareemah Abdulhaq and Abdussalam Ali Ahmed. Real-time object detection and recognition in embedded systems using open-source computer vision frameworks. Int. J. Electr. Eng. and Sustain., pages 103–118, 2025.
[39] Sabir Hossain and Deok-Jin Lee. Deep learning based real-time multiple-object detection and tracking on aerial imagery via a flying robot with gpu-based embedded devices. Sensors, 19(15):3371, 2019.
[40] Arief Setyanto, Theopilus Bayu Sasongko, Muhammad Ainul Fikri, and In Kee Kim. Near-edge computing aware object detection: A review. IEEE Access, 12:2989–3011, 2023.
[41] Shuo Wang, Chunlong Xia, Feng Lv, and Yifeng Shi. Rt-detrv3: Real-time end-to-end object detection with hierarchical dense positive supervision. In 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 1628–1636. IEEE, 2025.
[42] Andrea Bonci, Pangcheng David Cen Cheng, Marina Indri, Giacomo Nabissi, and Fiorella Sibona. Human-robot perception in industrial environments: A survey. Sensors, 21(5):1571, 2021.
[43] Ranjan Sapkota and Manoj Karkee. Object detection with multimodal large vision-language models: An in-depth review. Information Fusion, 126:103575, 2026.
[44] Peng Tang, Chetan Ramaiah, Yan Wang, Ran Xu, and Caiming Xiong. Proposal learning for semi-supervised object detection. In Proceedings of the IEEE/CVF winter conference on applications of computer vision, pages 2291–2301, 2021.
[45] Kihyuk Sohn, Zizhao Zhang, Chun-Liang Li, Han Zhang, Chen-Yu Lee, and Tomas Pfister. A simple semi-supervised learning framework for object detection. arXiv preprint arXiv:2005.04757, 2020.
[46] Gabriel Huang, Issam Laradji, David Vazquez, Simon Lacoste-Julien, and Pau Rodriguez. A survey of self-supervised and few-shot object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(4):4071–4089, 2022.
[47] Veenu Rani, Syed Tufael Nabi, Munish Kumar, Ajay Mittal, and Krishan Kumar. Self-supervised learning: A succinct review. Archives of Computational Methods in Engineering, 30(4):2761–2775, 2023.
[48] Yu-Jhe Li, Xiaoliang Dai, Chih-Yao Ma, Yen-Cheng Liu, Kan Chen, Bichen Wu, Zijian He, Kris Kitani, and Peter Vajda. Cross-domain adaptive teacher for object detection. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 7581–7590, 2022.
[49] Mengde Xu, Zheng Zhang, Han Hu, Jianfeng Wang, Lijuan Wang, Fangyun Wei, Xiang Bai, and Zicheng Liu. End-to-end semi-supervised object detection with soft teacher. In Proceedings of the IEEE/CVF international conference on computer vision, pages 3060–3069, 2021.
[50] Peng Mi, Jianghang Lin, Yiyi Zhou, Yunhang Shen, Gen Luo, Xiaoshuai Sun, Liujuan Cao, Rongrong Fu, Qiang Xu, and Rongrong Ji. Active teacher for semi-supervised object detection. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 14482–14491, 2022.
[51] Gang Li, Xiang Li, Yujie Wang, Yichao Wu, Ding Liang, and Shanshan Zhang. Pseco: Pseudo labeling and consistency training for semi-supervised object detection. In European Conference on Computer Vision, pages 457–472. Springer, 2022.
[52] Benjamin Caine, Rebecca Roelofs, Vijay Vasudevan, Jiquan Ngiam, Yuning Chai, Zhifeng Chen, and Jonathon Shlens. Pseudo-labeling for scalable 3d object detection. arXiv preprint arXiv:2103.02093, 2021.
[53] Longlong Jing and Yingli Tian. Self-supervised visual feature learning with deep neural networks: A survey. IEEE transactions on pattern analysis and machine intelligence, 43(11):4037–4058, 2020.
[54] Ming Kang, Chee-Ming Ting, Fung Fung Ting, and Raphael C-W Phan. Asf-yolo: A novel yolo model with attentional scale sequence fusion for cell instance segmentation. Image and Vision Computing, 147:105057, 2024.
[55] Ajantha Vijayakumar and Subramaniyaswamy Vairavasundaram. Yolo-based object detection models: A review and its applications. Multimedia Tools and Applications, 83(35):83535–83574, 2024.

