YOLOv5 Model Construction Source Code Explained: yaml Configuration and parse_model | 极客日志



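Summary (AI-generated): YOLOv5 builds its model from a yaml configuration file that describes the backbone and the head. The parse_model function turns each yaml row into a PyTorch module and handles the argument mapping for convolution, C3, Concat, and Detect layers. During training, the forward method calls _forward_once for single-scale inference and uses the save list to manage the connections between feature layers. The source code covers the full construction logic, from model initialization to setting the parameters of the detection head.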

星河入梦 · Published 2024/12/25 · Updated 2026/6/5 · 21 views

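Preface

This article records how YOLOv5 builds its model from a yaml model file: what the yaml parameters are for, how parse_model constructs the model, and finally how YOLOv5 uses the constructed model during training.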

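I. YOLOv5 file overview

models/yolo.py: the model construction file. It mainly contains the model wrapper class, class Model(nn.Module), and parse_model(d, ch), which builds the model from a yaml configuration such as yolov5s.yaml.
models/common.py: the building-block modules from which the network is assembled.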

II. Where YOLOv5 builds the model

In train.py, at around line 113, the model is created by the following code:

if pretrained: 
    with torch_distributed_zero_first(LOCAL_RANK): 
        weights = attempt_download(weights) # download if not found locally 
    ckpt = torch.load(weights, map_location=device) # load checkpoint 
    model = Model(cfg or ckpt['model'].yaml, ch=3, nc=nc, anchors=hyp.get('anchors')).to(device) # create 
    exclude = ['anchor'] if (cfg or hyp.get('anchors')) and not resume else [] # exclude keys

III. Parsing the model yaml file

The walkthrough below uses yolov5s.yaml as the reference.

1. The backbone section of the yaml

backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]


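Conv module parameters

c1, c2 = ch[f], args[0]  # input channels of the 'from' layer, output channels requested by the yaml
if c2 != no:  # if not output
    c2 = make_divisible(c2 * gw, 8)  # scale by width_multiple, round up to a multiple of 8

args = [c1, c2, *args[1:]]
# for these modules, pass n to the module itself and reset n to 1
if m in [BottleneckCSP, C3, C3TR, C3Ghost]:
    args.insert(2, n)  # number of repeats
    n = 1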
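The scaling arithmetic can be tried in isolation. The sketch below is a stand-alone approximation, not the YOLOv5 code itself: make_divisible is re-implemented here as a simple round-up helper, scale_row is a hypothetical name, and the multiples gd=0.33 and gw=0.50 are assumed to be the yolov5s values. The (number, channels) pairs are illustrative rows in the style of the yaml.

```python
import math


def make_divisible(x, divisor):
    # Round x up to the nearest multiple of divisor (stand-in for the YOLOv5 helper).
    return math.ceil(x / divisor) * divisor


def scale_row(n, c2, gd, gw):
    # Hypothetical helper: apply the depth and width gains to one yaml row.
    # The depth gain only applies when the yaml asks for more than one repeat.
    n_scaled = max(round(n * gd), 1) if n > 1 else n
    # The width gain is applied first, then the result is rounded up to a multiple of 8.
    c2_scaled = make_divisible(c2 * gw, 8)
    return n_scaled, c2_scaled


gd, gw = 0.33, 0.50  # assumed depth_multiple and width_multiple for yolov5s
rows = [(1, 64), (3, 128), (6, 256), (9, 512), (3, 1024)]  # illustrative (number, channels)
scaled = [scale_row(n, c2, gd, gw) for n, c2 in rows]
print(scaled)
```

Under these assumptions a row that asks for 9 repeats and 512 channels ends up with 3 repeats and 256 channels, which is why the same yaml layout can describe models of different sizes.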

C3 module parameters

class C3(nn.Module):
    # CSP Bottleneck with 3 convolutions
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(2 * c_, c2, 1)  # act=FReLU(c2)
        self.m = nn.Sequential(*[Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)])
        # self.m = nn.Sequential(*[CrossConv(c_, c_, 3, 1, g, 1.0, shortcut) for _ in range(n)])

    def forward(self, x):
        return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))
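To see how a yaml row ends up as C3 constructor arguments, the following sketch mirrors the argument rewriting in parse_model and the channel bookkeeping in C3.__init__ without importing torch. The helper names c3_channel_plan and build_c3_args are hypothetical, and the channel values are illustrative.

```python
def c3_channel_plan(c1, c2, e=0.5):
    # Return (in_channels, out_channels) for each sub-layer of a C3 block.
    c_ = int(c2 * e)  # hidden channels, computed as in C3.__init__
    return {
        "cv1": (c1, c_),         # branch that feeds the Bottleneck stack
        "cv2": (c1, c_),         # shortcut branch
        "bottleneck": (c_, c_),  # every repeated Bottleneck keeps c_ channels
        "cv3": (2 * c_, c2),     # runs after the two branches are concatenated
    }


def build_c3_args(c1, c2, n, yaml_args):
    # Mirror parse_model: prepend c1, replace the requested width with the scaled c2,
    # then insert the repeat count n at position 2.
    args = [c1, c2, *yaml_args[1:]]
    args.insert(2, n)
    return args


plan = c3_channel_plan(64, 64)
args = build_c3_args(512, 256, 1, [512, False])
print(plan)
print(args)
```

The resulting list lines up with the signature C3(c1, c2, n, shortcut, ...), so the False in a head row such as [-1, 3, C3, [512, False]] lands on the shortcut parameter.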

2. The head section of the yaml

head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]

Concat module parameters

A Concat row such as [[-1, 6], 1, Concat, [1]] joins the output of the previous layer with the output of layer 6 along dimension 1, so its output channel count is the sum of the channel counts of its inputs.
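The channel bookkeeping for a Concat row in the head can be checked with plain lists. This is a minimal sketch: concat_out_channels is a hypothetical helper, and the ch values are the per-layer output channels one would expect for layers 0 to 11 of yolov5s with a width multiple of 0.50, which is an assumption rather than something printed in the article.

```python
def concat_out_channels(ch, f):
    # Sum the output channels of every 'from' layer; -1 means the previous layer.
    return sum(ch[x] for x in f)


# Assumed output channels of layers 0..11 (index i holds the channels of layer i).
ch = [32, 64, 64, 128, 128, 256, 256, 512, 512, 512, 256, 256]

# Layer 12 is [[-1, 6], 1, Concat, [1]]: the previous layer (11) plus backbone layer 6.
c2 = concat_out_channels(ch, [-1, 6])
print(c2)
```

Indexing ch directly with the layer number only works because parse_model discards the initial input-channel entry after the first layer, so position i really is layer i.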

Detect module parameters

elif m is Detect:
    args.append([ch[x] for x in f])  # input channels of each detection layer
    if isinstance(args[1], int):  # number of anchors
        args[1] = [list(range(args[1] * 2))] * len(f)
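The effect of the Detect branch on the argument list can be reproduced without torch. In the sketch below, detect_args and outputs_per_cell are hypothetical helper names, the ch dictionary stands in for the channel list, and nc=80 with 3 anchors per level is an assumed configuration.

```python
def detect_args(yaml_args, f, ch):
    # Mirror the Detect branch: append the input channels of each 'from' layer and
    # expand an integer anchor count into placeholder anchor lists.
    args = list(yaml_args)
    args.append([ch[x] for x in f])
    if isinstance(args[1], int):  # number of anchors given instead of anchor sizes
        args[1] = [list(range(args[1] * 2))] * len(f)
    return args


def outputs_per_cell(nc, anchors):
    # number of outputs = anchors * (classes + 5), as computed in parse_model
    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors
    return na * (nc + 5)


ch = {17: 128, 20: 256, 23: 512}  # assumed channels of the three detection inputs
args = detect_args([80, 3], [17, 20, 23], ch)
no = outputs_per_cell(80, 3)
print(args)
print(no)
```

Each placeholder list holds 2 values per anchor (width and height), and the 5 extra outputs per anchor cover the four box coordinates plus the objectness score.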

IV. Overall walkthrough of model construction

class Model(nn.Module):
    def __init__(self, cfg='yolov5s.yaml', ch=3, nc=None, anchors=None):  # model, input channels, number of classes
        super().__init__()
        if isinstance(cfg, dict):
            self.yaml = cfg  # model dict
        else:  # is *.yaml
            import yaml  # for torch hub
            self.yaml_file = Path(cfg).name
            with open(cfg, errors='ignore') as f:
                self.yaml = yaml.safe_load(f)  # model dict

        # Define model
        ch = self.yaml['ch'] = self.yaml.get('ch', ch)  # input channels
        if nc and nc != self.yaml['nc']:
            LOGGER.info(f"Overriding model.yaml nc={self.yaml['nc']} with nc={nc}")
            self.yaml['nc'] = nc  # override yaml value
        if anchors:
            LOGGER.info(f'Overriding model.yaml anchors with anchors={anchors}')
            self.yaml['anchors'] = round(anchors)  # override yaml value
        self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch])  # model, savelist
        self.names = [str(i) for i in range(self.yaml['nc'])]  # default names: class indices as strings
        self.inplace = self.yaml.get('inplace', True)

        # Build strides, anchors (the block below configures the Detect module)
        m = self.model[-1]  # Detect()
        if isinstance(m, Detect):
            s = 256  # 2x min stride
            m.inplace = self.inplace
            # run a dummy input torch.zeros(1, ch, s, s) through the model to obtain the strides
            m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))])  # forward
            m.anchors /= m.stride.view(-1, 1, 1)  # rescale the anchors to each feature level
            check_anchor_order(m)
            self.stride = m.stride  # [8, 16, 32]
            self._initialize_biases()  # only run once; initializes the Detect biases

        # Init weights, biases
        initialize_weights(self)
        self.info()
        LOGGER.info('')

    def forward(self, x, augment=False, profile=False, visualize=False):
        if augment:
            return self._forward_augment(x)  # augmented inference, None
        return self._forward_once(x, profile, visualize)  # single-scale inference, train
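The stride and anchor rescaling done for the Detect module is simple arithmetic, sketched here with plain Python numbers instead of tensors. The feature-map heights 32, 16 and 8 are assumed outputs for a 256x256 dummy input (consistent with the [8, 16, 32] strides noted in the source), and the anchor sizes are assumed to be the default yolov5s values.

```python
s = 256  # side length of the dummy input, as in Model.__init__

# Assumed heights of the three Detect inputs (P3, P4, P5) for a 256x256 input.
feature_heights = [32, 16, 8]
stride = [s / h for h in feature_heights]

# Assumed default anchors in pixels, as (width, height) pairs per detection level.
anchors_px = [
    [(10, 13), (16, 30), (33, 23)],
    [(30, 61), (62, 45), (59, 119)],
    [(116, 90), (156, 198), (373, 326)],
]

# Equivalent of m.anchors /= m.stride.view(-1, 1, 1): express anchors in grid cells.
anchors_grid = [
    [(w / st, h / st) for w, h in level]
    for level, st in zip(anchors_px, stride)
]
print(stride)
print(anchors_grid[0])
```

After the division, an anchor is measured in units of one grid cell at its own detection level, so a 10x13 pixel anchor at stride 8 becomes 1.25x1.625 cells.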

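The model and its save list are built by parse_model:

self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch])  # model, savelist

During training, forward runs single-scale inference through _forward_once:

return self._forward_once(x, profile, visualize)  # single-scale inference, train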

def _forward_once(self, x, profile=False, visualize=False):
    y, dt = [], []  # outputs
    for m in self.model:
        # m.f decides where the input of module m comes from; a list such as [-1, 6]
        # usually belongs to Concat or Detect, which need several named feature maps
        if m.f != -1:  # if not from previous layer
            # for m.f == [-1, 6], -1 is replaced by the previous output x and 6 by the saved output y[6]
            x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]  # from earlier layers
        if profile:
            self._profile_one_layer(m, x, dt)
        x = m(x)  # run
        # self.save is the save list returned by parse_model; outputs of layers in it are kept in y,
        # all others are stored as None to keep the positions aligned
        # m.i is the layer index, i.e. the row number of the module in the yaml
        y.append(x if m.i in self.save else None)  # save output
        if visualize:
            feature_visualization(x, m.type, m.i, save_dir=visualize)
    return x
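The routing logic of _forward_once does not depend on torch, so it can be illustrated with integers and small functions. This is a toy sketch: forward_once, the four layers, and the save list are all hypothetical, with a sum standing in for Concat.

```python
def forward_once(layers, save, x):
    # layers is a list of (from_index, function) pairs; save lists the layer
    # indices whose outputs must be kept for later layers.
    y = []
    for i, (f, fn) in enumerate(layers):
        if f != -1:  # input does not come from the previous layer only
            x = y[f] if isinstance(f, int) else [x if j == -1 else y[j] for j in f]
        x = fn(x)
        y.append(x if i in save else None)  # keep only the outputs that are reused
    return x, y


layers = [
    (-1, lambda x: x + 1),          # layer 0
    (-1, lambda x: x * 2),          # layer 1
    (-1, lambda x: x - 3),          # layer 2
    ([-1, 0], lambda xs: sum(xs)),  # layer 3: combines layer 2 and layer 0
]
save = [0]  # only layer 0 is referenced later, so only its output is stored
out, y = forward_once(layers, save, 1)
print(out, y)
```

Storing None for every layer that is never referenced again keeps the index positions in y aligned with the layer numbers while avoiding holding on to feature maps that nothing will read.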

V. parse_model source walkthrough

def parse_model(d, ch):  # model_dict, input_channels(3)
    LOGGER.info('\n%3s%18s%3s%10s %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments'))
    anchors, nc, gd, gw = d['anchors'], d['nc'], d['depth_multiple'], d['width_multiple']
    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors  # number of anchors per feature-map cell (3 here)
    no = na * (nc + 5)  # number of outputs = anchors * (classes + 5)

    # layers: one entry per yaml row, joined at the end with nn.Sequential(*layers)
    # c2: output channels of the current row, requested by the yaml and scaled by width_multiple
    # ch: output channels of the layers seen so far; starts as [3]
    layers, save, c2 = [], [], ch[-1]  # layers, savelist, ch out
    for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']):  # from, number, module, args
        # i is the layer index, i.e. the row number of the module in the yaml
        # eval turns a string such as 'Conv' into the Python object with that name
        m = eval(m) if isinstance(m, str) else m  # eval strings
        for j, a in enumerate(args):
            try:
                args[j] = eval(a) if isinstance(a, str) else a  # eval strings
            except NameError:
                pass

        n = n_ = max(round(n * gd), 1) if n > 1 else n  # depth gain: final number of repeats
        # per-module handling; each branch also determines the output channels c2
        if m in [Conv, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, MixConv2d, Focus, CrossConv,
                 BottleneckCSP, C3, C3TR, C3SPP, C3Ghost]:
            c1, c2 = ch[f], args[0]
            if c2 != no:  # if not output
                c2 = make_divisible(c2 * gw, 8)

            args = [c1, c2, *args[1:]]
            # for these modules, pass n to the module itself and reset n to 1
            if m in [BottleneckCSP, C3, C3TR, C3Ghost]:
                args.insert(2, n)  # number of repeats
                n = 1
        elif m is nn.BatchNorm2d:
            args = [ch[f]]
        elif m is Concat:
            # sum the channels of all inputs, e.g. [[-1, 6], 1, Concat, [1]] adds layer -1 and layer 6
            c2 = sum([ch[x] for x in f])
        elif m is Detect:
            args.append([ch[x] for x in f])
            if isinstance(args[1], int):  # number of anchors
                args[1] = [list(range(args[1] * 2))] * len(f)
        elif m is Contract:
            c2 = ch[f] * args[0] ** 2
        elif m is Expand:
            c2 = ch[f] // args[0] ** 2
        else:
            c2 = ch[f]

        m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args)  # module
        t = str(m)[8:-2].replace('__main__.', '')  # module type
        np = sum([x.numel() for x in m_.parameters()])  # number params
        # attach index, 'from' index, type, number params; forward relies on these later
        m_.i, m_.f, m_.type, m_.np = i, f, t, np
        LOGGER.info('%3s%18s%3s%10.0f %-40s%-30s' % (i, f, n_, np, t, args))  # print
        save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)  # append to savelist
        layers.append(m_)
        if i == 0:
            ch = []  # drop the 3-channel input entry so that ch[i] is the output of layer i
        ch.append(c2)  # record the output channels of every row, Concat included
    return nn.Sequential(*layers), sorted(save)
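The save list returned by parse_model can be derived from the 'from' column alone. The sketch below reuses the same expression on plain lists; build_save_list is a hypothetical helper, the head 'from' values are taken from the head yaml shown earlier, and the ten backbone rows are assumed to all use -1.

```python
def build_save_list(from_indices):
    # Collect every layer index that a later layer reads from, skipping plain -1.
    save = []
    for i, f in enumerate(from_indices):
        # x % i turns a negative relative index into an absolute layer index.
        save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)
    return sorted(save)


backbone_from = [-1] * 10  # assumed: layers 0..9 all read from the previous layer
head_from = [
    -1, -1, [-1, 6], -1,   # layers 10..13
    -1, -1, [-1, 4], -1,   # layers 14..17
    -1, [-1, 14], -1,      # layers 18..20
    -1, [-1, 10], -1,      # layers 21..23
    [17, 20, 23],          # layer 24: Detect
]
save = build_save_list(backbone_from + head_from)
print(save)

# A relative index other than -1 is resolved against the current layer number.
print(build_save_list([-1, -1, -1, -1, -1, -2]))
```

Only the layers in this list have their outputs retained during the forward pass; every other position in the output list is filled with None.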
