YOLOv8 C++ Deployment: Running V5/V7/V8 with OpenCV DNN

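In industrial deployments of real-time object detection, Python is convenient for development but often struggles to meet high-concurrency, low-latency requirements. On edge devices and embedded systems in particular, resource usage and inference speed become the key bottlenecks. C++, with its efficient memory management and strong runtime performance, is therefore a common first choice for production.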

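OpenCV's DNN module is a lightweight inference engine. It does not depend on a full deep learning framework such as PyTorch or TensorFlow, and it can load a model and run inference from a single .onnx file, which suits projects that want a simple deployment. It also supports CUDA acceleration when OpenCV is built with CUDA enabled; on an NVIDIA GPU, small models can exceed 100 FPS (see the benchmark table at the end).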

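This article aims to connect training and deployment: one modular C++ codebase deploys the ONNX models of YOLOv5, YOLOv7, and YOLOv8, runs on either CPU or GPU, and can be wrapped once and reused across projects.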

Architecture: Object-Oriented Design with a Unified Interface

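The YOLO versions differ in their output structure (for example, YOLOv5 and YOLOv8 produce similar outputs but need slightly different post-processing, while YOLOv7 uses multi-scale feature fusion). To handle this, a base class encapsulates the shared behavior and subclasses implement the version-specific parts.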

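The Yolo base class handles the common pipeline: model loading, image preprocessing, result drawing, and parameter configuration.
The Yolov5, Yolov7, and Yolov8 subclasses each override Detect() to implement their own decoding logic.
All models take a 640×640 input; formatToSquare pads the original image to a square so that resizing preserves the aspect ratio.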
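Because the image is padded to a square before being resized to 640×640, box coordinates predicted by the network have to be scaled back by (longer image side) / 640. The following dependency-free sketch shows that arithmetic; BoxI and toOriginalImage are illustrative names (assumptions), with BoxI standing in for cv::Rect, and it assumes the padding is added at the right and bottom so no offset is needed.

```cpp
#include <algorithm>
#include <cassert>

// Stand-in for cv::Rect so the sketch has no OpenCV dependency (an assumption).
struct BoxI {
    int x, y, width, height;
};

// Map a center-format box (cx, cy, w, h) given in network-input pixels back to
// original-image pixels. With padding at the right/bottom only, a single
// uniform scale factor is enough.
BoxI toOriginalImage(float cx, float cy, float w, float h,
                     int srcCols, int srcRows, int inputSize = 640) {
    assert(srcCols > 0 && srcRows > 0 && inputSize > 0);
    const int maxEdge = std::max(srcCols, srcRows);  // side of the padded square
    const float scale = static_cast<float>(maxEdge) / inputSize;
    BoxI box;
    box.x = static_cast<int>((cx - 0.5f * w) * scale);
    box.y = static_cast<int>((cy - 0.5f * h) * scale);
    box.width = static_cast<int>(w * scale);
    box.height = static_cast<int>(h * scale);
    return box;
}
```

For a 1280×720 frame the padded square is 1280×1280, so the scale is 2, and a box centered at (320, 180) with size 100×50 maps to a 200×100 box whose top-left corner is (540, 310).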

This keeps the code structure clear and makes it easy to extend: supporting YOLO-NAS or another variant later only requires adding a new subclass.
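The extension point can be illustrated with a stripped-down sketch. Every name here (DetectorBase, V5Detector, V8Detector, NasDetector, makeDetector) is a hypothetical stand-in rather than the article's Yolo classes, and the detection logic itself is omitted; the point is only that a new variant touches one new class plus one line in the factory.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Simplified stand-in for the article's Yolo base class (an assumption).
class DetectorBase {
public:
    virtual ~DetectorBase() = default;
    // Each variant would override its decoding logic; here it only reports its name
    virtual std::string version() const = 0;
};

class V5Detector : public DetectorBase {
public:
    std::string version() const override { return "YOLOv5"; }
};

class V8Detector : public DetectorBase {
public:
    std::string version() const override { return "YOLOv8"; }
};

// Hypothetical new variant: adding it requires no change to existing classes
class NasDetector : public DetectorBase {
public:
    std::string version() const override { return "YOLO-NAS"; }
};

// Callers depend only on the base interface; returns nullptr for unknown keys
std::unique_ptr<DetectorBase> makeDetector(const std::string& key) {
    if (key == "v5") return std::make_unique<V5Detector>();
    if (key == "v8") return std::make_unique<V8Detector>();
    if (key == "nas") return std::make_unique<NasDetector>();
    return nullptr;
}
```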

Detection Result Struct

struct Detection {
    int class_id{0};        // class ID
    float confidence{0.0f}; // confidence score
    cv::Rect box{};         // bounding box
};

This simple struct is used throughout the pipeline and ultimately feeds visualization and application logic.
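As a rough illustration of how downstream code might consume these results, the sketch below builds a display label and counts detections of one class. To stay self-contained it redefines Detection with a plain Rect stand-in for cv::Rect; makeLabel and countClass are hypothetical helper names, not part of the article's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

struct Rect { int x{}, y{}, width{}, height{}; };  // stand-in for cv::Rect (assumption)

struct Detection {
    int class_id{0};
    float confidence{0.0f};
    Rect box{};
};

// Build a display label such as "person 0.90"; falls back to "unknown"
// when the class ID is outside the name table.
std::string makeLabel(const Detection& det, const std::vector<std::string>& classNames) {
    assert(det.class_id >= 0);
    const std::size_t id = static_cast<std::size_t>(det.class_id);
    const std::string name = id < classNames.size() ? classNames[id] : "unknown";
    char buf[16];
    std::snprintf(buf, sizeof(buf), "%.2f", det.confidence);
    return name + " " + buf;
}

// Count detections of one class whose confidence is at least minConf
int countClass(const std::vector<Detection>& dets, int classId, float minConf) {
    int count = 0;
    for (const Detection& d : dets) {
        if (d.class_id == classId && d.confidence >= minConf) ++count;
    }
    return count;
}
```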

Header File: yoloV8.h

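#pragma once
#include <algorithm>
#include <cmath>
#include <iostream>
#include <string>
#include <vector>
#include <opencv2/opencv.hpp>
using namespace std;
using namespace cv;
using namespace cv::dnn;

// Detection result
struct Detection {
    int class_id{};
    float confidence{};
    cv::Rect box{};
};

// Base class: shared pipeline and configuration.
// NOTE: several numeric defaults, the class names, and two declarations are
// missing from this listing; each gap is marked and must be filled in before compiling.
class Yolo {
public:
    // Version-specific decoding, overridden by each subclass
    // (signature taken from the Yolov5::Detect definition)
    virtual std::vector<Detection> Detect(cv::Mat& srcImg, cv::dnn::Net& net) = 0;
    // [declaration missing from the source listing]
    // [declaration missing from the source listing]
    // Logistic sigmoid (the original function name is missing; "sigmoid" is a placeholder)
    float sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }
    // Pad the image with black pixels to a square, keeping it in the top-left corner
    cv::Mat formatToSquare(const cv::Mat& src) {
        int col = src.cols;
        int row = src.rows;
        int maxEdge = std::max(col, row);
        cv::Mat square = cv::Mat::zeros(maxEdge, maxEdge, CV_8UC3);
        src.copyTo(square(cv::Rect(0, 0, col, row)));
        return square;
    }
    const int inputWidth = 640;
    const int inputHeight = 640;
    float modelConfidenceThreshold = /* value missing from the source listing */;
    float modelScoreThreshold = /* value missing from the source listing */;
    float modelNMSThreshold = /* value missing from the source listing */;
    // Class-name strings are missing from the source listing (Yolov5::Detect assumes 80 classes)
    std::vector<std::string> classNames = { /* class names missing */ };
};

class Yolov5 : public Yolo {
public:
    std::vector<Detection> Detect(cv::Mat& srcImg, cv::dnn::Net& net) override;
private:
    float confThreshold = /* value missing from the source listing */;
    float nmsThreshold = /* value missing from the source listing */;
};

class Yolov7 : public Yolo {
public:
    std::vector<Detection> Detect(cv::Mat& srcImg, cv::dnn::Net& net) override;
private:
    float confThreshold = /* value missing from the source listing */;
    float nmsThreshold = /* value missing from the source listing */;
    const int strideSize = 3; // number of entries in strides
    const int strides[3] = { /* three stride values missing from the source listing */ };
    const float anchors[3][6] = {
        /* 3 x 6 anchor values missing from the source listing */
    };
};

class Yolov8 : public Yolo {
public:
    std::vector<Detection> Detect(cv::Mat& srcImg, cv::dnn::Net& net) override;
private:
    float confThreshold = /* value missing from the source listing */;
    float nmsThreshold = /* value missing from the source listing */;
};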
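The strides and anchors members of Yolov7 suggest that its Detect() decodes raw per-scale head outputs. The article does not show that code, so the following is only a sketch of the anchor-based box decoding used by YOLOv5/YOLOv7-style heads, under the assumption that the exported model emits raw (pre-decode) values. BoxF, sigmoidRaw, and decodeAnchorBox are illustrative names, and the stride and anchor sizes are passed in as parameters rather than assumed.

```cpp
#include <cassert>
#include <cmath>

struct BoxF {
    float cx, cy, w, h;  // center-format box in network-input pixels
};

inline float sigmoidRaw(float x) { return 1.0f / (1.0f + std::exp(-x)); }

// Decode one raw prediction (tx, ty, tw, th) located at grid cell (gx, gy)
// of a head with the given stride and anchor size:
//   center = (2*sigmoid(t) - 0.5 + grid) * stride
//   size   = (2*sigmoid(t))^2 * anchor
BoxF decodeAnchorBox(float tx, float ty, float tw, float th,
                     int gx, int gy, int stride,
                     float anchorW, float anchorH) {
    assert(stride > 0);
    BoxF b;
    b.cx = (sigmoidRaw(tx) * 2.0f - 0.5f + static_cast<float>(gx)) * stride;
    b.cy = (sigmoidRaw(ty) * 2.0f - 0.5f + static_cast<float>(gy)) * stride;
    const float sw = sigmoidRaw(tw) * 2.0f;
    const float sh = sigmoidRaw(th) * 2.0f;
    b.w = sw * sw * anchorW;
    b.h = sh * sh * anchorH;
    return b;
}
```

With all raw values at zero, the sigmoid is 0.5, so the box lands at the center of its grid cell and takes exactly the anchor's size. If the ONNX export already includes the decode step, this stage is unnecessary and the output can be read like the YOLOv5 one.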

std::vector<Detection> Yolov5::Detect(cv::Mat& srcImg, cv::dnn::Net& net) {
    // Pad to a square; blobFromImage then resizes it to the network input size
    cv::Mat inputImg = formatToSquare(srcImg);
    cv::Mat blob;
    cv::dnn::blobFromImage(inputImg, blob, 1.0 / 255.0, cv::Size(inputWidth, inputHeight), cv::Scalar(), true, false);
    net.setInput(blob);
    std::vector<cv::Mat> outputs;
    net.forward(outputs, net.getUnconnectedOutLayersNames());

    float* data = reinterpret_cast<float*>(outputs[0].data);
    int rows = outputs[0].size[1]; // typically 25200
    int dims = outputs[0].size[2]; // 85 = 4 box values + objectness + 80 class scores

    // Scale factors from network coordinates back to the padded image
    float ratioX = (float)inputImg.cols / inputWidth;
    float ratioY = (float)inputImg.rows / inputHeight;

    std::vector<int> classIds;
    std::vector<float> confidences;
    std::vector<cv::Rect> boxes;
    for (int i = 0; i < rows; ++i) {
        float confidence = data[4]; // objectness
        if (confidence > modelConfidenceThreshold) {
            // Class scores follow the 5 box/objectness values
            cv::Mat scores(1, dims - 5, CV_32FC1, data + 5);
            cv::Point maxLoc;
            double maxScore;
            cv::minMaxLoc(scores, nullptr, &maxScore, nullptr, &maxLoc);
            if (maxScore > modelScoreThreshold) {
                classIds.push_back(maxLoc.x);
                confidences.push_back(confidence);
                // Convert center format (x, y, w, h) to top-left format
                float x = data[0], y = data[1], w = data[2], h = data[3];
                int left = (int)((x - w * 0.5) * ratioX);
                int top = (int)((y - h * 0.5) * ratioY);
                int width = (int)(w * ratioX);
                int height = (int)(h * ratioY);
                boxes.emplace_back(left, top, width, height);
            }
        }
        data += dims; // advance to the next candidate row
    }

    // Non-maximum suppression removes overlapping duplicates
    std::vector<int> nmsIndices;
    cv::dnn::NMSBoxes(boxes, confidences, confThreshold, nmsThreshold, nmsIndices);

    std::vector<Detection> detections;
    for (int idx : nmsIndices) {
        Detection det;
        det.class_id = classIds[idx];
        det.confidence = confidences[idx];
        det.box = boxes[idx];
        detections.push_back(det);
    }
    return detections;
}
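Yolov5::Detect above reads one candidate per row, with objectness at index 4. The article does not show Yolov8::Detect, but a standard Ultralytics YOLOv8 ONNX export has a different layout: a (4 + numClasses) × numAnchors tensor (84 × 8400 for 80 classes at 640×640), stored channel-first and with no objectness value. The sketch below shows how candidates might be read from such a buffer without OpenCV; Candidate and decodeV8 are hypothetical names, and the layout is an assumption about the exported model.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Candidate {
    int class_id;
    float score;
    float cx, cy, w, h;  // center-format box in network-input pixels
};

// data holds (4 + numClasses) rows of numAnchors floats each (channel-first):
// rows 0..3 are cx, cy, w, h; the remaining rows are per-class scores.
std::vector<Candidate> decodeV8(const std::vector<float>& data,
                                int numClasses, int numAnchors,
                                float scoreThreshold) {
    assert(numClasses > 0 && numAnchors > 0);
    const std::size_t n = static_cast<std::size_t>(numAnchors);
    assert(data.size() == (4 + static_cast<std::size_t>(numClasses)) * n);
    std::vector<Candidate> out;
    for (std::size_t a = 0; a < n; ++a) {
        // Best class score doubles as the confidence (there is no objectness)
        int best = 0;
        float bestScore = data[4 * n + a];
        for (int c = 1; c < numClasses; ++c) {
            const float s = data[(4 + static_cast<std::size_t>(c)) * n + a];
            if (s > bestScore) { bestScore = s; best = c; }
        }
        if (bestScore > scoreThreshold) {
            Candidate cand;
            cand.class_id = best;
            cand.score = bestScore;
            cand.cx = data[0 * n + a];
            cand.cy = data[1 * n + a];
            cand.w = data[2 * n + a];
            cand.h = data[3 * n + a];
            out.push_back(cand);
        }
    }
    return out;
}
```

The surviving candidates would then go through the same coordinate scaling and non-maximum suppression as in the YOLOv5 path.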

Model	CPU (i7-11800H)	CUDA (RTX 3060)	Recommended use
YOLOv5n	~35 FPS	~110 FPS	Rapid prototyping
YOLOv7-tiny	~28 FPS	~95 FPS	Edge device deployment
YOLOv8n	~30 FPS	~105 FPS	High-accuracy industrial inspection
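The article does not say how these FPS figures were measured. One common approach, sketched here as an assumption rather than the author's method, is to run a few warm-up iterations (the first GPU inferences are usually slower) and then time a fixed number of runs with a monotonic clock. measureFps is a hypothetical helper; runOnce would wrap a single call to Detect().

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Returns average frames per second over `iterations` calls to runOnce,
// after `warmup` untimed calls. Returns 0 if no time elapsed.
double measureFps(const std::function<void()>& runOnce, int warmup, int iterations) {
    assert(warmup >= 0 && iterations > 0);
    for (int i = 0; i < warmup; ++i) runOnce();
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) runOnce();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() > 0.0 ? iterations / elapsed.count() : 0.0;
}
```

Whether preprocessing, post-processing, and drawing are included inside runOnce changes the result considerably, so that choice should be reported alongside any FPS number.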
