Small target detection in remote sensing images based on lightweight YOLOv7-tiny

doi:10.6046/zrzyyg.2024102

Abstract

Small-target detection in remote sensing images suffers from low accuracy, owing to large scale variations, complex scenes, and the limited feature information of small targets, and from low efficiency, owing to the large parameter counts and high complexity of current object detection models. To address both problems, this study proposes a lightweight YOLOv7-tiny model for remote sensing image detection. First, the network neck is improved by incorporating group shuffle convolution (GSConv) and VoV-GSCSP modules, which reduces computational cost and network complexity while maintaining sufficient detection accuracy. Second, a dynamic head (DyHead) combined with an attention mechanism is adopted for prediction; it enhances the detection head by applying multi-head self-attention across scale-aware feature layers, spatial-aware positions, and task-aware output channels. Finally, the loss function of the original model is optimized by integrating the normalized Wasserstein distance (NWD) metric for assessing small targets with a bounding box regression loss based on the minimum point distance IoU (MPDIoU), which improves robustness in small-target detection. Experimental results show that the proposed algorithm achieves mAP@50 scores of 87.7% and 94.7% on the DIOR and RSOD datasets, respectively, which are 2.7 and 5.1 percentage points higher than those of the original YOLOv7-tiny model, while the frames per second (FPS) increase by 12.2% and 11.9%, respectively. The proposed algorithm therefore improves both the accuracy and the real-time performance of small-target detection in remote sensing images.
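The two box-similarity measures named in the abstract can be illustrated with a short sketch. It follows the published definitions of NWD (boxes modeled as 2D Gaussians, similarity exp(-W2/C)) and MPDIoU (IoU minus the squared corner distances normalized by the squared image diagonal). The abstract does not state how the two terms are combined, so the weighted sum in `combined_box_loss`, its `nwd_weight` parameter, and the default constant `c=12.8` are assumptions made purely for illustration, not the authors' implementation.

```python
import math


def iou_xyxy(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mpdiou(a, b, img_w, img_h):
    """MPDIoU: IoU penalized by the squared distances between the two
    top-left corners and the two bottom-right corners, each normalized
    by the squared image diagonal (img_w**2 + img_h**2)."""
    norm = img_w ** 2 + img_h ** 2
    d1_sq = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2  # top-left corners
    d2_sq = (a[2] - b[2]) ** 2 + (a[3] - b[3]) ** 2  # bottom-right corners
    return iou_xyxy(a, b) - d1_sq / norm - d2_sq / norm


def nwd(a, b, c=12.8):
    """Normalized Wasserstein distance between two boxes.

    Each box is modeled as a Gaussian with mean (cx, cy) and standard
    deviations (w/2, h/2). The constant c is dataset dependent; 12.8 is
    an assumed placeholder, not a value taken from the abstract.
    """
    def as_gaussian(box):
        x1, y1, x2, y2 = box
        return ((x1 + x2) / 2, (y1 + y2) / 2, (x2 - x1) / 2, (y2 - y1) / 2)

    ga, gb = as_gaussian(a), as_gaussian(b)
    w2 = math.sqrt(sum((p - q) ** 2 for p, q in zip(ga, gb)))
    return math.exp(-w2 / c)


def combined_box_loss(a, b, img_w, img_h, nwd_weight=0.5, c=12.8):
    """Hypothetical weighted sum of the NWD and MPDIoU losses.

    The weighting scheme is an assumption for illustration only.
    """
    nwd_loss = 1.0 - nwd(a, b, c)
    mpdiou_loss = 1.0 - mpdiou(a, b, img_w, img_h)
    return nwd_weight * nwd_loss + (1.0 - nwd_weight) * mpdiou_loss
```

For two 4 x 4 pixel boxes that lie close together without overlapping, such as `(10, 10, 14, 14)` and `(16, 10, 20, 14)`, plain IoU is zero, whereas NWD still returns a value strictly between 0 and 1 that decreases smoothly as the boxes move apart. This is the property that makes a Wasserstein-based term attractive for small targets.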

Keywords: remote sensing images; object detection; YOLOv7-tiny; GSConv; MPDIoU; DyHead

CLC number: TP79

Issue Date: 03 September 2025

Cite this article:

Ziyao XU, Wu YANG, Xiaolong SHI. Small target detection in remote sensing images based on lightweight YOLOv7-tiny[J]. Remote Sensing for Natural Resources, 2025, 37(4): 1-11.

URL:

https://www.gtzyyg.com/EN/10.6046/zrzyyg.2024102 OR https://www.gtzyyg.com/EN/Y2025/V37/I4/1

Fig.1 Optimized structure of YOLOv7-tiny network

Fig.2 GSConv module

Fig.3 VoV-GSCSP bottleneck unit module and VoV-GSCSP module

Fig.4 DyHead structure

Tab.1 Information about the DIOR dataset and the RSOD dataset

Fig.5 DIOR dataset

Fig.6 RSOD dataset

Fig.7 Precision, recall, and mAP curves of YOLOv7-tiny before and after improvement

Tab.2 Comparison of ablation experiment results

Tab.3 Comparison of experimental results of different algorithms on the DIOR dataset

Tab.4 Comparison of experimental results of different algorithms on the RSOD dataset

Fig.8-1 Comparison of detection results between the proposed algorithm and YOLOv7-tiny on the DIOR dataset

Fig.8-2 Comparison of detection results between the proposed algorithm and YOLOv7-tiny on the DIOR dataset
