轻量化YOLOv7-tiny的遥感图像小目标检测

doi:10.6046/zrzyyg.2024102

摘要
图/表
参考文献
相关文章
Metrics

全文: PDF(5695 KB)

HTML
输出: BibTeX | EndNote (RIS)

摘要

针对遥感图像尺度变化大、场景信息复杂、小目标特征信息较少等导致的检测精度较低和当前目标检测模型参数量大、复杂性高导致的检测效率低的问题,该文提出了一种轻量化的YOLOv7-tiny遥感图像检测算法。首先,使用组混洗卷积(group shuffle convolution, GSConv)和VoV-GSCSP模块改进网络颈部,在保持足够检测精度的同时减少模型的计算量和网络结构的复杂性; 其次,在预测时采用一种结合注意力机制的动态预测头(dynamic head, DyHead),通过在尺度感知的特征层、空间感知的空间位置及任务感知的输出通道内,结合多头自注意机制,提高目标检测头的性能; 最后,利用基于Wasserstein距离的小目标检测评估方法(normalized Wasserstein distance, NWD)结合基于最小点距离的边界框回归损失函数(minimum points distance intersection over union, MPDIoU)来优化原模型的损失函数,增强对小目标检测的鲁棒性。实验结果表明,本文所提出的算法在DIOR数据集和RSOD数据集的mAP@0.5分别达到87.7%和94.7%,比原YOLOv7-tiny模型分别提高了2.7百分点和5.1百分点,且每秒检测帧率(frames per second,FPS)分别提高了12.2%和11.9%,能够有效提高遥感图像小目标检测的精度和实时性。

	服务

	把本文推荐给朋友
	加入引用管理器
	E-mail Alert
	RSS
	作者相关文章
	徐紫窈
	杨武
	施小龙

关键词 ：遥感图像, 目标检测, YOLOv7-tiny, GSConv, MPDIoU, DyHead

Abstract：

To address the issues of low detection accuracy caused by significant scale variations, complex scenes, and limited feature information of small targets in remote sensing images, as well as low detection efficiency resulting from the large parameter size and high complexity of current object detection models, this study proposes a lightweight YOLOv7-tiny model for remote sensing image detection. First, the network neck was improved by incorporating group shuffle convolution (GSConv) and VoV-GSCSP modules. This allows for sufficient detection accuracy while reducing computational costs and network complexity. Second, a dynamic head (DyHead) combined with an attention mechanism was adopted during prediction. The performance of the detection head was enhanced using multi-head self-attention across scale-aware feature layers, spatially-aware positions, and task-aware output channels. Finally, the loss function of the original model was optimized by integrating the normalized Wasserstein distance (NWD) metric for small-target assessment and a bounding box regression loss function based on the minimum point distance IoU (MPDIoU). This assists in enhancing robustness for small target detection. The experimental results demonstrate that the proposed algorithm achieved mAP@50 scores of 87.7% and 94.7% on the DIOR and RSOD datasets, respectively, indicating increases of 2.7 and 5.1 percentage points compared to the original YOLOv7-tiny model. Furthermore, the frames per second (FPS) increased by 12.2% and 11.9%, respectively. Therefore, the proposed algorithm can effectively enhance both the accuracy and real-time performance of small target detection from remote sensing images.

Key words： remote sensing images object detection YOLOv7-tiny GSConv MPDIoU DyHead

收稿日期: 2024-03-15 出版日期: 2025-09-03

ZTFLH:

TP79

基金资助:国家自然科学基金项目“面向领域拓展的开放式目标检测算法研究”(62306053);重庆理工大学研究生创新基金项目“优化YOLOv7-tiny模型在遥感图像目标中的应用”(gzlcx20243164)

通讯作者: 杨武(1965-),男,教授,主要从事社交网络媒体分析、信息检索等方面的研究。Email: yangwu@cqut.edu.cn。

作者简介: 徐紫窈(1998-),女,硕士研究生,主要研究方向为遥感图像目标检测。Email: xuziyao1854@163.com。

引用本文:

徐紫窈, 杨武, 施小龙. 轻量化YOLOv7-tiny的遥感图像小目标检测[J]. 自然资源遥感, 2025, 37(4): 1-11.
XU Ziyao, YANG Wu, SHI Xiaolong. Small target detection in remote sensing images based on lightweight YOLOv7-tiny. Remote Sensing for Natural Resources, 2025, 37(4): 1-11.

链接本文:

https://www.gtzyyg.com/CN/10.6046/zrzyyg.2024102 或 https://www.gtzyyg.com/CN/Y2025/V37/I4/1

Fig.1 优化的YOLOv7-tiny网络结构

Fig.2 GSConv模块

Fig.3 VoV-GSCSP瓶颈单元模块与VoV-GSCSP模块

Fig.4 DyHead结构

Tab.1 DIOR及RSOD数据集信息

Fig.5 DIOR数据集

Fig.6 RSOD数据集

Fig.7 YOLOv7-tiny改进前后的精确率、召回率和mAP曲线

Tab.2 消融实验结果对比

Tab.3 不同算法在DIOR数据集上的实验结果对比

Tab.4 不同算法在RSOD数据集上的实验结果对比

Fig.8-1 所提算法与YOLOv7-tiny在DIOR数据集上检测结果对比

Fig.8-2 所提算法与YOLOv7-tiny在DIOR数据集上检测结果对比

[1]	Zou Z, Chen K, Shi Z, et al. Object detection in 20 years:A survey[J]. Proceedings of the IEEE, 2023, 111(3):257-276.
[2]	Dai J, Li Y, He K, et al. R-FCN:Object detection via region-based fully convolutional networks[J]. Advances in neural information processing systems, 2016, 29:379-387.
[3]	Zaidi S S A, Ansari M S, Aslam A, et al. A survey of modern deep learning based object detection models[J]. Digital Signal Processing, 2022, 126:103514.
[4]	付涵, 范湘涛, 严珍珍, 等. 基于深度学习的遥感图像目标检测技术研究进展[J]. 遥感技术与应用, 2022, 37(2):290-305. doi: 10.11873/j.issn.1004-0323.2022.2.0290
	Fu H, Fan X T, Yan Z Z, et al. Progress of object detection in remote sensing images based on deep learning[J]. Remote Sensing Technology and Application, 2022, 37(2):290-305.
[5]	Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// 2014 IEEE Conference on Computer Vision and Pattern Recognition.June 23-28,2014,Columbus,OH,USA.IEEE, 2014: 580-587.
[6]	Girshick R. Fast R-CNN[C]// 2015 IEEE International Conference on Computer Vision (ICCV).December 7-13, 2015: 1440-1448.
[7]	Ren S, He K, Girshick R, et al. Faster R-CNN:Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6):1137-1149.
[8]	Redmon J, Divvala S, Girshick R, et al. You only look once:Unified,real-time object detection[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).June 27-30,2016,Las Vegas,NV,USA.IEEE, 2016: 779-788.
[9]	Redmon J, Farhadi A. YOLO9000:Better,faster,stronger[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).IEEE, 2017: 6517-6525.
[10]	Li C, Li L, Jiang H, et al. YOLOv6:A single-stage object detection framework for industrial applications[J/OL]. arXiv, 2022(2022-09-07)[2024-03/12].https://doi.org/10.48550/arXiv.2209.02976.
[11]	Wang C Y, Yeh I H, Mark Liao H Y. YOLOv9:Learning what you want toLearn using programmable gradient information[C]// Computer Vision-ECCV 2024. Cham: Springer Nature Switzerland, 2025: 1-21.
[12]	Liu W, Anguelov D, Erhan D, et al. SSD:single shot MultiBox detector[M]// Computer Vision-ECCV 2016.Cham: Springer International Publishing, 2016: 21-37.
[13]	Zhou X, Wang D, Krhenbühl P. Objects as points[J/OL]. arXiv, 2019(2019-04-25)[2024-03/12].https://doi.org/10.48550/arXiv.1904.07850.
[14]	Shamsolmoali P, Zareapoor M, Yang J, et al. Enhanced single-shot detector for small object detection in remote sensing images[C]// IGARSS 2022-2022 IEEE International Geoscience and Remote Sensing Symposium. IEEE, 2022: 1716-1719.
[15]	张路青, 郭莹. 基于卷积神经网络的遥感图像目标检测识别[J]. 舰船电子工程, 2023, 43(5):49-53.
	Zhang L Q, Guo Y. Remote sensing image object detection and reco-gnition based on convolutional neural network[J]. Ship Electronic Engineering, 2023, 43(5):49-53.
[16]	Cao S, Wang T, Li T, et al. UAV small target detection algorithm based on an improved YOLOv5s model[J]. Journal of Visual Communication and Image Representation, 2023, 97:103936.
[17]	Li X, Wei Y, Li J, et al. Improved YOLOv7 algorithm for small object detection in unmanned aerial vehicle image scenarios[J]. Applied Sciences, 2024, 14(4):1664.
[18]	Wang C Y, Bochkovskiy A, Liao H M. YOLOv7:Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).IEEE, 2023: 7464-7475.
[19]	Tan M, Pang R, Le Q V. EfficientDet:Scalable and efficient object detection[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).IEEE, 2020: 10781-10790.
[20]	李安达, 吴瑞明, 李旭东. 改进YOLOv7的小目标检测算法研究[J]. 计算机工程与应用, 2024, 60(1):122-134. doi: 10.3778/j.issn.1002-8331.2307-0004
	Li A D, Wu R M, Li X D. Research on improving YOLOv7’s small target detection algorithm[J]. Computer Engineering and Applications, 2024, 60(1):122-134. doi: 10.3778/j.issn.1002-8331.2307-0004
[21]	Gevorgyan Z. SIoU loss: More powerful learning for bounding box regression[EB/OL]. 2022: 2205.12740. https://arxiv.org/abs/2205.12740v1.
[22]	Qi Z, Ren Y, Long J, et al. Application of YOLOv7 in remote sen-sing image target detection[C]// 2023 42nd Chinese Control Conference (CCC).IEEE, 2023: 7603-7608.
[23]	Li H, Li J, Wei H, et al. Slim-neck by GSConv: A lightweight-design for real-time detector architectures[EB/OL]. 2022: 2206.02424. https://arxiv.org/abs/2206.02424v3.
[24]	Dai X, Chen Y, Xiao B, et al. Dynamic head:Unifying object detection heads with attentions[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).IEEE, 2021: 7369-7378.
[25]	Wang J, Xu C, Yang W, et al. A normalized Gaussian Wasserstein distance for tiny object detection[EB/OL]. 2021: 2110.13389. https://arxiv.org/abs/2110.13389v2.
[26]	Ma S, Xu Y, Ma S, et al. MPDIoU: A loss for efficient and accurate bounding box regression[EB/OL]. 2023: 2307.07662. https://arxiv.org/abs/2307.07662v1.
[27]	Zheng Z, Wang P, Liu W, et al. Distance-IoU loss:Faster and better learning for bounding box regression[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7):12993-13000.
[28]	Bochkovskiy A, Wang C Y, Liao H M. YOLOv4:Optimal speed and accuracy of object detection[EB/OL]. 2020: 2004.10934. https://arxiv.org/abs/2004.10934v1.
[29]	He K, Zhang X, Ren S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[C]// Computer Vision-ECCV 2014. Cham: Springer International Publishing, 2014: 346-361.
[30]	Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]// 2017 IEEE International Conference on Computer Vision (ICCV).IEEE, 2017: 2999-3007.
[31]	Chollet F. Xception: Deep learning with depthwise separable convolutions[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).IEEE, 2017: 1800-1807.
[32]	Howard A G, Zhu M, Chen B, et al. MobileNets: Efficient convolutional neural networks for mobile vision applications[J/OL]. ar-Xiv, 2017(2020-04-17)[2024-03/12].https://doi.org/10.48550/arXiv.1704.04861.
[33]	Zhang X, Zhou X, Lin M, et al. ShuffleNet:An extremely efficient convolutional neural network for mobile devices[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.IEEE, 2018: 6848-6856.
[34]	Zhang Y F, Ren W, Zhang Z, et al. Focal and efficient IOU loss for accurate bounding box regression[J]. Neurocomputing, 2022, 506:146-157.
[35]	Li K, Wan G, Cheng G, et al. Object detection in optical remote sensing images:A survey and a new benchmark[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 159:296-307.
[36]	Xiao Z, Liu Q, Tang G, et al. Elliptic Fourier transformation-based histograms of oriented gradients for rotationally invariant object detection in remote-sensing images[J]. International Journal of Remote Sensing, 2015, 36(2):618-644.

[1]	刘浩然, 严天笑, 朱月琴, 王艳萍, 陈祖谊, 杨昭颖, 朱浩濛. 基于一种改进YOLOv7算法模型的滑坡识别研究——以四川省白格地区为例[J]. 自然资源遥感, 2025, 37(4): 48-57.
[2]	何晓军, 罗杰. 结合上下文与类别感知特征融合的高分遥感图像语义分割[J]. 自然资源遥感, 2025, 37(2): 1-10.
[3]	徐欣钰, 李小军, 盖钧飞, 李轶鲲. 结合NSCT变换和引导滤波的多光谱图像全色锐化算法[J]. 自然资源遥感, 2025, 37(1): 24-30.
[4]	潘俊杰, 慎利, 鄢薪, 聂欣, 董宽林. 一种基于对抗学习的高分辨率遥感影像语义分割无监督域自适应方法[J]. 自然资源遥感, 2024, 36(4): 149-157.
[5]	郭勇, 张琳翔, 许泽宇, 蔡中祥. 结合桥梁难分样本优化的大清河流域水坝遥感检测[J]. 自然资源遥感, 2024, 36(4): 201-209.
[6]	郭彭浩, 邱建林, 赵淑男. 高频域多深度空洞网络的遥感图像全色锐化算法[J]. 自然资源遥感, 2024, 36(3): 146-153.
[7]	温泉, 李璐, 熊立, 杜磊, 刘庆杰, 温奇. 基于深度学习的遥感图像水体提取综述[J]. 自然资源遥感, 2024, 36(3): 57-71.
[8]	宋爽爽, 肖开斐, 刘昭华, 曾昭亮. 一种基于YOLOv5的高分辨率遥感影像目标检测方法[J]. 自然资源遥感, 2024, 36(2): 50-59.
[9]	赵鹤婷, 李小军, 徐欣钰, 盖钧飞. 基于ICM的高光谱图像自适应全色锐化算法[J]. 自然资源遥感, 2024, 36(2): 97-104.
[10]	林聃, 李秋岑, 陈志奎, 钟芳明, 李丽方. 多标签遥感图像分类研究现状与展望[J]. 自然资源遥感, 2024, 36(2): 10-20.
[11]	刘宇佳, 谢诗哲, 杜阳, 严瑾, 南燕云, 温中凯. 结合空间语义注意力的二段式遥感图像修复网络[J]. 自然资源遥感, 2024, 36(1): 58-66.
[12]	牛祥华, 黄微, 黄睿, 蒋斯立. 基于注意力特征融合的高保真遥感图像薄云去除[J]. 自然资源遥感, 2023, 35(3): 116-123.
[13]	王建强, 邹朝晖, 刘荣波, 刘志松. 基于U²-Net深度学习模型的沿海水产养殖塘遥感信息提取[J]. 自然资源遥感, 2023, 35(3): 17-24.
[14]	徐欣钰, 李小军, 赵鹤婷, 盖钧飞. NSCT和PCNN相结合的遥感图像全色锐化算法[J]. 自然资源遥感, 2023, 35(3): 64-70.
[15]	邱磊, 张学志, 郝大为. 基于深度学习的视频SAR动目标检测与跟踪算法[J]. 自然资源遥感, 2023, 35(2): 157-166.

Viewed

Full text

Abstract

Cited

Shared

Discussed