[1] Zhong-Qiu Zhao, Peng Zheng, Shou-tao Xu, and Xindong Wu. Object detection with deep learning: A review. IEEE transactions on neural networks and learning systems, 30(11):3212–3232, 2019.
[2] Zhengxia Zou, Keyan Chen, Zhenwei Shi, Yuhong Guo, and Jieping Ye. Object detection in 20 years: A survey. Proceedings of the IEEE, 111(3):257–276, 2023.
[3] Chhavi Rana et al. Artificial intelligence based object detection and traffic prediction by autonomous vehicles—a review. Expert Systems with Applications, 255:124664, 2024.
[4] Zohaib Khan, Yue Shen, and Hui Liu. Object detection in agriculture: A comprehensive review of methods, applications, challenges, and future directions. Agriculture, 15(13):1351, 2025.
[5] Ranjan Sapkota, Marco Flores-Calero, Rizwan Qureshi, Chetan Badgujar, Upesh Nepal, Alwin Poulose, Peter Zeno, Uday Bhanu Prakash Vaddevolu, Sheheryar Khan, Maged Shoman, et al. Yolo advances to its genesis: a decadal and comprehensive review of the you only look once (yolo) series. Artificial Intelligence Review, 58(9):274, 2025.
[6] Ranjan Sapkota, Rahul Harsha Cheppally, Ajay Sharda, and Manoj Karkee. Rf-detr object detection vs yolov12: A study of transformer-based and cnn-based architectures for single-class and multi-class greenfruit detection in complex orchard environments under label ambiguity. arXiv preprint arXiv:2504.13099, 2025.
[7] Ranjan Sapkota, Awood Ahmed, and Manoj Karkee. Comparative analysis of yolov8 and mask r-cnn for instance segmentation in complex orchard environments. Artificial Intelligence in Agriculture, 13:84–99, 2024.
[8] Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 779–788, 2016.
[9] Joseph Redmon and Ali Farhadi. Yolo9000: better, faster, stronger. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 7263–7271, 2017.
[10] Joseph Redmon and Ali Farhadi. Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767, 2018.
[11] Alexey Bochkovskiy, Chien-Yao Wang, and Hong-Yuan Mark Liao. Yolov4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934, 2020.
[12] Chuyi Li, Lulu Li, Hongliang Jiang, Kaiheng Weng, Yifei Geng, Liang Li, Zaidan Ke, Qingyuan Li, Meng Cheng, Weiqiang Nie, et al. Yolov6: A single-stage object detection framework for industrial applications. arXiv preprint arXiv:2209.02976, 2022.
[13] Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao. Yolov7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 7464–7475, 2023.
[14] Chien-Yao Wang, I-Hau Yeh, and Hong-Yuan Mark Liao. Yolov9: Learning what you want to learn using programmable gradient information. In European conference on computer vision, pages 1–21. Springer, 2024.
[15] Ao Wang, Hui Chen, Lihao Liu, Kai Chen, Zijia Lin, Jungong Han, et al. Yolov10: Real-time end-to-end object detection. Advances in Neural Information Processing Systems, 37:107984–108011, 2024.
[16] Yunjie Tian, Qixiang Ye, and David Doermann. Yolov12: Attention-centric real-time object detectors. arXiv preprint arXiv:2502.12524, 2025.
[17] Mengqi Lei, Siqi Li, Yihong Wu, Han Hu, You Zhou, Xinhu Zheng, Guiguang Ding, Shaoyi Du, Zongze Wu, and Yue Gao. Yolov13: Real-time object detection with hypergraph-enhanced adaptive visual perception. arXiv preprint arXiv:2506.17733, 2025.
[18] Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision, pages 2961–2969, 2017.
[19] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. IEEE transactions on pattern analysis and machine intelligence, 39(6):1137–1149, 2016.
[20] Tausif Diwan, G Anirudh, and Jitendra V Tembhurne. Object detection using yolo: challenges, architectural successors, datasets and applications. Multimedia Tools and Applications, 82(6):9243–9275, 2023.
[21] Momina Liaqat Ali and Zhou Zhang. The yolo framework: A comprehensive review of evolution, applications, and benchmarks in object detection. Computers, 13(12):336, 2024.
[22] Kyriakos D Apostolidis and George A Papakostas. Delving into yolo object detection models: Insights into adversarial robustness. Electronics, 14(8):1624, 2025.
[23] Enerst Edozie, Aliyu Nuhu Shuaibu, Ukagwu Kelechi John, and Bashir Olaniyi Sadiq. Comprehensive review of recent developments in visual object detection based on deep learning. Artificial Intelligence Review, 58(9):277, 2025.
[24] Mupparaju Sohan, Thotakura Sai Ram, and Ch Venkata Rami Reddy. A review on yolov8 and its advancements. In International Conference on Data Intelligence and Cognitive Informatics, pages 529–545. Springer, 2024.
[25] J Javaria Farooq, Muhammad Muaz, Khurram Khan Jadoon, Nayyer Aafaq, and Muhammad Khizer Ali Khan. An improved yolov8 for foreign object debris detection with optimized architecture for small objects. Multimedia Tools and Applications, 83(21):60921–60947, 2024.
[26] Maria Trigka and Elias Dritsas. A comprehensive survey of machine learning techniques and models for object detection. Sensors, 25(1):214, 2025.
[27] Md Tanzib Hosain, Asif Zaman, Mushfiqur Rahman Abir, Shanjida Akter, Sawon Mursalin, and Shadman Sakeeb Khan. Synchronizing object detection: Applications, advancements and existing challenges. IEEE Access, 12:54129–54167, 2024.
[28] Ambati Pravallika, Mohammad Farukh Hashmi, and Aditya Gupta. Deep learning frontiers in 3d object detection: a comprehensive review for autonomous driving. IEEE Access, 2024.
[29] Jiawei Tian, Seungho Lee, and Kyungtae Kang. Faster r-cnn in healthcare and disease detection: A comprehensive review. In 2025 International Conference on Electronics, Information, and Communication (ICEIC), pages 1–6. IEEE, 2025.
[30] Peng Fu and Jiyang Wang. Lithology identification based on improved faster r-cnn. Minerals, 14(9):954, 2024.
[31] Samiyaa Yaseen Mohammed. Architecture review: Two-stage and one-stage object detection. Franklin Open, page 100322, 2025.
[32] Richard Johnson. YOLO Object Detection Explained: Definitive Reference for Developers and Engineers. HiTeX Press, 2025.
[33] Daniel Pestana, Pedro R Miranda, João D Lopes, Rui P Duarte, Mário P Véstias, Horacio C Neto, and José T De Sousa. A full featured configurable accelerator for object detection with yolo. IEEE Access, 9:75864–75877, 2021.
[34] Duy Thanh Nguyen, Tuan Nghia Nguyen, Hyun Kim, and Hyuk-Jae Lee. A high-throughput and power-efficient fpga implementation of yolo cnn for object detection. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 27(8):1861–1873, 2019.
[35] Caiwen Ding, Shuo Wang, Ning Liu, Kaidi Xu, Yanzhi Wang, and Yun Liang. Req-yolo: A resource-aware, efficient quantization framework for object detection on fpgas. In Proceedings of the 2019 ACM/SIGDA international symposium on field-programmable gate arrays, pages 33–42, 2019.
[36] Patricia Citranegara Kusuma and Benfano Soewito. Multi-object detection using yolov7 object detection algorithm on mobile device. Journal of Applied Engineering and Technological Science (JAETS), 5(1):305–320, 2023.
[37] Nico Surantha and Nana Sutisna. Key considerations for real-time object recognition on edge computing devices. Applied Sciences, 15(13):7533, 2025.
[38] Kareemah Abdulhaq and Abdussalam Ali Ahmed. Real-time object detection and recognition in embedded systems using open-source computer vision frameworks. Int. J. Electr. Eng. and Sustain., pages 103–118, 2025.
[39] Sabir Hossain and Deok-Jin Lee. Deep learning based real-time multiple-object detection and tracking on aerial imagery via a flying robot with gpu-based embedded devices. Sensors, 19(15):3371, 2019.
[40] Arief Setyanto, Theopilus Bayu Sasongko, Muhammad Ainul Fikri, and In Kee Kim. Near-edge computing aware object detection: A review. IEEE Access, 12:2989–3011, 2023.
[41] Shuo Wang, Chunlong Xia, Feng Lv, and Yifeng Shi. Rt-detrv3: Real-time end-to-end object detection with hierarchical dense positive supervision. In 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 1628–1636. IEEE, 2025.
[42] Andrea Bonci, Pangcheng David Cen Cheng, Marina Indri, Giacomo Nabissi, and Fiorella Sibona. Human-robot perception in industrial environments: A survey. Sensors, 21(5):1571, 2021.
[43] Ranjan Sapkota and Manoj Karkee. Object detection with multimodal large vision-language models: An in-depth review. Information Fusion, 126:103575, 2026.
[44] Peng Tang, Chetan Ramaiah, Yan Wang, Ran Xu, and Caiming Xiong. Proposal learning for semi-supervised object detection. In Proceedings of the IEEE/CVF winter conference on applications of computer vision, pages 2291–2301, 2021.
[45] Kihyuk Sohn, Zizhao Zhang, Chun-Liang Li, Han Zhang, Chen-Yu Lee, and Tomas Pfister. A simple semi-supervised learning framework for object detection. arXiv preprint arXiv:2005.04757, 2020.
[46] Gabriel Huang, Issam Laradji, David Vazquez, Simon Lacoste-Julien, and Pau Rodriguez. A survey of self-supervised and few-shot object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(4):4071–4089, 2022.
[47] Veenu Rani, Syed Tufael Nabi, Munish Kumar, Ajay Mittal, and Krishan Kumar. Self-supervised learning: A succinct review. Archives of Computational Methods in Engineering, 30(4):2761–2775, 2023.
[48] Yu-Jhe Li, Xiaoliang Dai, Chih-Yao Ma, Yen-Cheng Liu, Kan Chen, Bichen Wu, Zijian He, Kris Kitani, and Peter Vajda. Cross-domain adaptive teacher for object detection. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 7581–7590, 2022.
[49] Mengde Xu, Zheng Zhang, Han Hu, Jianfeng Wang, Lijuan Wang, Fangyun Wei, Xiang Bai, and Zicheng Liu. End-to-end semi-supervised object detection with soft teacher. In Proceedings of the IEEE/CVF international conference on computer vision, pages 3060–3069, 2021.
[50] Peng Mi, Jianghang Lin, Yiyi Zhou, Yunhang Shen, Gen Luo, Xiaoshuai Sun, Liujuan Cao, Rongrong Fu, Qiang Xu, and Rongrong Ji. Active teacher for semi-supervised object detection. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 14482–14491, 2022.
[51] Gang Li, Xiang Li, Yujie Wang, Yichao Wu, Ding Liang, and Shanshan Zhang. Pseco: Pseudo labeling and consistency training for semi-supervised object detection. In European Conference on Computer Vision, pages 457–472. Springer, 2022.
[52] Benjamin Caine, Rebecca Roelofs, Vijay Vasudevan, Jiquan Ngiam, Yuning Chai, Zhifeng Chen, and Jonathon Shlens. Pseudo-labeling for scalable 3d object detection. arXiv preprint arXiv:2103.02093, 2021.
[53] Longlong Jing and Yingli Tian. Self-supervised visual feature learning with deep neural networks: A survey. IEEE transactions on pattern analysis and machine intelligence, 43(11):4037–4058, 2020.
[54] Ming Kang, Chee-Ming Ting, Fung Fung Ting, and Raphael C-W Phan. Asf-yolo: A novel yolo model with attentional scale sequence fusion for cell instance segmentation. Image and Vision Computing, 147:105057, 2024.
[55] Ajantha Vijayakumar and Subramaniyaswamy Vairavasundaram. Yolo-based object detection models: A review and its applications. Multimedia Tools and Applications, 83(35):83535–83574, 2024.