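Preface
[Hacking YOLOv5-6.x (Part 1)]: combining the lightweight networks Shufflenetv2, Mobilenetv3, and Ghostnet
[Hacking YOLOv5-6.x (Part 2)]: adding the ACON activation function, the CBAM and CA attention mechanisms, and the weighted bidirectional feature pyramid network BiFPN
This article uses YOLOv5 v6.1. Readers who are not yet familiar with the YOLOv5-6.x network structure can first read: [YOLOv5-6.x] Network Model & Source Code Analysis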
Training settings:
$ python train.py --weights '' --cfg yolov5s.yaml --data data/VOC2007.yaml --hyp data/hyps/hyp.scratch-high.yaml --epochs 200 --device 0
- Hardware: one GTX 1080 GPU
- Dataset: VOC2007
- Hyperparameters: hyp.scratch-low.yaml
- Training length: 200 epochs
- All other parameters keep the default values set in the source code
Test settings:
$ python val.py --weights yolov5s.pt --data data/VOC2007.yaml --img 832 --augment --half --device 0
- Testing is done with val.py
- Test-time augmentation (TTA) is enabled (--augment)
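Model combinations
Based on the experimental results, the following choices were made:
- Use Ghost modules to replace the Conv and C3 modules in the Backbone and Neck
- Add the CA attention mechanism at the end of the Backbone (before SPPF)
- Add one BiFPN connection in the Neck
The modified model yaml files are as follows: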
Ghost+BiFPN
nc: 20  # number of classes (VOC)
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, GhostConv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3Ghost, [128]],
   [-1, 1, GhostConv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3Ghost, [256]],
   [-1, 1, GhostConv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3Ghost, [512]],
   [-1, 1, GhostConv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3Ghost, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]
head:
  [[-1, 1, GhostConv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3Ghost, [512, False]],  # 13
   [-1, 1, GhostConv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3Ghost, [256, False]],  # 17 (P3/8-small)
   [-1, 1, GhostConv, [256, 3, 2]],
   [[-1, 14, 6], 1, Concat, [1]],  # cat head P4 + backbone P4 (BiFPN edge)
   [-1, 3, C3Ghost, [512, False]],  # 20 (P4/16-medium)
   [-1, 1, GhostConv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3Ghost, [1024, False]],  # 23 (P5/32-large)
   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
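When reading the yaml above, keep in mind that the repeat counts and channel numbers are nominal: YOLOv5's `parse_model` scales them by `depth_multiple` and `width_multiple`. The sketch below reproduces that arithmetic for a few backbone rows. The helper names `scaled_repeats` and `scaled_channels` are made up for illustration and are not part of the YOLOv5 API, and the rounding rule is assumed to match v6.1.

```python
import math

DEPTH_MULTIPLE = 0.33  # depth_multiple from the yaml above
WIDTH_MULTIPLE = 0.50  # width_multiple from the yaml above


def scaled_repeats(n, gd=DEPTH_MULTIPLE):
    """Repeat count after depth scaling (layers with n == 1 are left alone)."""
    return max(round(n * gd), 1) if n > 1 else n


def scaled_channels(c, gw=WIDTH_MULTIPLE, divisor=8):
    """Output channels after width scaling, rounded up to a multiple of 8."""
    return math.ceil(c * gw / divisor) * divisor


# (module, repeats, output channels) copied from the backbone section
rows = [("GhostConv", 1, 128), ("C3Ghost", 3, 128), ("C3Ghost", 6, 256),
        ("C3Ghost", 9, 512), ("C3Ghost", 3, 1024)]

for name, n, c in rows:
    print(f"{name:9s} n={n} -> {scaled_repeats(n)}, c={c} -> {scaled_channels(c)}")
```

With these multipliers, the `C3Ghost` blocks written as 3, 6, 9, 3 repeats are built with 1, 2, 3, 1 repeats, and every channel count is halved.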
Ghost+CA
nc: 20  # number of classes (VOC)
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, GhostConv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3Ghost, [128]],
   [-1, 1, GhostConv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3Ghost, [256]],
   [-1, 1, GhostConv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3Ghost, [512]],
   [-1, 1, GhostConv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3Ghost, [1024]],
   [-1, 1, CABlock, [1024, 32]],  # 9 (CA attention, before SPPF)
   [-1, 1, SPPF, [1024, 5]],  # 10
  ]
head:
  [[-1, 1, GhostConv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3Ghost, [512, False]],  # 14
   [-1, 1, GhostConv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3Ghost, [256, False]],  # 18 (P3/8-small)
   [-1, 1, GhostConv, [256, 3, 2]],
   [[-1, 15], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3Ghost, [512, False]],  # 21 (P4/16-medium)
   [-1, 1, GhostConv, [512, 3, 2]],
   [[-1, 11], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3Ghost, [1024, False]],  # 24 (P5/32-large)
   [[18, 21, 24], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
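Compared with a config that has no CA layer, inserting `CABlock` as layer 9 pushes every later layer down by one. Absolute `from` indices in the head therefore have to be bumped (`10` becomes `11`, `14` becomes `15`, and the `Detect` inputs become `[18, 21, 24]`), while relative indices such as `-1` and references to earlier backbone layers (`4`, `6`) stay the same. A minimal sketch of that bookkeeping, using a hypothetical helper `shift_from` that is not part of YOLOv5:

```python
def shift_from(from_idx, insert_at):
    """Update a layer's `from` field after a new layer is inserted at `insert_at`.

    Negative (relative) indices are unchanged; absolute indices that pointed
    at `insert_at` or later move down by one.
    """
    if isinstance(from_idx, list):
        return [shift_from(i, insert_at) for i in from_idx]
    if from_idx < 0:
        return from_idx
    return from_idx + 1 if from_idx >= insert_at else from_idx


# `from` fields of the head layers with absolute indices, before adding CA
before = [[-1, 6], [-1, 4], [-1, 14], [-1, 10], [17, 20, 23]]
after = [shift_from(f, insert_at=9) for f in before]
print(after)  # -> [[-1, 6], [-1, 4], [-1, 15], [-1, 11], [18, 21, 24]]
```

Forgetting this shift is an easy way to end up concatenating the wrong feature maps, so it is worth checking the indices whenever a layer is added to the backbone.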
Ghost+BiFPN+CA
nc: 20  # number of classes (VOC)
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, GhostConv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3Ghost, [128]],
   [-1, 1, GhostConv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3Ghost, [256]],
   [-1, 1, GhostConv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3Ghost, [512]],
   [-1, 1, GhostConv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3Ghost, [1024]],
   [-1, 1, CABlock, [1024, 32]],  # 9 (CA attention, before SPPF)
   [-1, 1, SPPF, [1024, 5]],  # 10
  ]
head:
  [[-1, 1, GhostConv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3Ghost, [512, False]],  # 14
   [-1, 1, GhostConv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3Ghost, [256, False]],  # 18 (P3/8-small)
   [-1, 1, GhostConv, [256, 3, 2]],
   [[-1, 15, 6], 1, Concat, [1]],  # cat head P4 + backbone P4 (BiFPN edge)
   [-1, 3, C3Ghost, [512, False]],  # 21 (P4/16-medium)
   [-1, 1, GhostConv, [512, 3, 2]],
   [[-1, 11], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3Ghost, [1024, False]],  # 24 (P5/32-large)
   [[18, 21, 24], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
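The only BiFPN-specific change in this file is the three-input `Concat` that also pulls in backbone layer 6. In YOLOv5 a `Concat` layer outputs the sum of its input channels, so the extra edge widens the input of the `C3Ghost` that follows it. A rough sketch of the channel arithmetic, assuming `width_multiple: 0.50` and the layer numbering of the yaml above (the `nominal` dictionary is filled in by hand for illustration, not read from a model):

```python
WIDTH_MULTIPLE = 0.50

# Nominal output channels (as written in the yaml) of the layers feeding
# the Concat at index 20: layer 19 (GhostConv), 15 (GhostConv), 6 (C3Ghost).
nominal = {19: 256, 15: 256, 6: 512}
out_ch = {idx: int(c * WIDTH_MULTIPLE) for idx, c in nominal.items()}


def concat_channels(sources):
    """Channels produced by concatenating the given layers along dim 1."""
    return sum(out_ch[i] for i in sources)


plain = concat_channels([19, 15])     # two-input concat, as in Ghost+CA
bifpn = concat_channels([19, 15, 6])  # with the extra backbone skip edge
print(plain, bifpn)  # -> 256 512
```

The wider input to that one `C3Ghost` is where the small increase in parameters and FLOPs of the BiFPN variants comes from.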
Experiment log
$ python val.py --weights yolov5s.pt --data data/VOC2007.yaml --img 832 --augment --half --device 0
| No. | Model | mAP@0.5 (%) | mAP@0.5:0.95 (%) | params (M) | FLOPs (G) |
|---|---|---|---|---|---|
| 0 | yolov5s-baseline | 70.6 | 43.2 | 7.06 | 16.0 |
| 1 | yolov5s-Shufflenetv2 | 60.8 | 35.5 | 3.84 | 8.1 |
| 2 | yolov5s-Mobilenetv3-small | 60.9 | 33.9 | 3.59 | 6.4 |
| 3 | yolov5s-Ghostnet | 70.2 | 43.6 | 3.73 | 8.3 |
| 4 | yolov5s-MetaAconC | 69.2 | 41.3 | 7.47 | 16.3 |
| 5 | yolov5s-BiFPN | 70.7 | 42.5 | 7.13 | 16.2 |
| 6 | yolov5s-CBAM | 68.7 | 40.8 | 6.46 | 14.2 |
| 7 | yolov5s-CA | 70.7 | 43.1 | 7.09 | 16.1 |
| 8 | yolov5s-Ghostnet-CA | 68.7 | 42.7 | 3.76 | 8.4 |
| 9 | yolov5s-Ghostnet-BiFPN | 68.5 | 42.8 | 3.80 | 8.5 |
| 10 | yolov5s-Ghostnet-BiFPN-CA | 69.5 | 43.4 | 3.82 | 8.5 |
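To make the accuracy/cost trade-off easier to see, the short script below recomputes each Ghostnet variant's difference from the baseline using the numbers in the table. It only rearranges the reported values; the function and key names are illustrative.

```python
# (mAP@0.5, mAP@0.5:0.95, params in M, FLOPs in G), copied from the table
results = {
    "yolov5s-baseline": (70.6, 43.2, 7.06, 16.0),
    "yolov5s-Ghostnet": (70.2, 43.6, 3.73, 8.3),
    "yolov5s-Ghostnet-CA": (68.7, 42.7, 3.76, 8.4),
    "yolov5s-Ghostnet-BiFPN": (68.5, 42.8, 3.80, 8.5),
    "yolov5s-Ghostnet-BiFPN-CA": (69.5, 43.4, 3.82, 8.5),
}

base = results["yolov5s-baseline"]


def compare(name):
    """Return mAP deltas (points) and params/FLOPs savings (%) vs. baseline."""
    m50, m5095, params, flops = results[name]
    return {
        "d_mAP50": round(m50 - base[0], 1),
        "d_mAP50_95": round(m5095 - base[1], 1),
        "params_saved_pct": round(100 * (1 - params / base[2]), 1),
        "flops_saved_pct": round(100 * (1 - flops / base[3]), 1),
    }


for name in results:
    if name != "yolov5s-baseline":
        print(name, compare(name))
```

For the full Ghostnet-BiFPN-CA combination this works out to roughly 46% fewer parameters and 47% fewer FLOPs, for a 1.1-point drop in mAP@0.5 and a 0.2-point gain in mAP@0.5:0.95.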
Analysis of results
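Judging from the results, adding Ghostnet, BiFPN, or CA to YOLOv5 on its own keeps accuracy close to the baseline, with a slight gain on one of the two metrics (Ghostnet: 43.6 vs. 43.2 mAP@0.5:0.95; BiFPN and CA: 70.7 vs. 70.6 mAP@0.5), whereas CBAM on its own scores lower (68.7 / 40.8). Combining the modules does not bring a clear further improvement: the best combination, Ghostnet-BiFPN-CA, reaches 69.5 / 43.4. Possible reasons are: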
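- Too few training epochs: the official training schedule uses 300, so more epochs could be tried later
- Hyperparameter choice: hyp.scratch-low.yaml, which is used here, is tuned for the COCO2017 dataset, and this may affect training on VOC2007 to some extent; a later experiment could train on COCO2017 directly
- The position of the CA module could still be changed (for example, placing it directly after SPPF, or somewhere in the middle of the Backbone)
No better improvement has come to mind so far. Everyone is welcome to join the discussion and share their own ways of modifying YOLOv5~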