Small target detection in remote sensing images based on lightweight YOLOv7-tiny
Abstract: Remote sensing images exhibit large scale variations, complex scenes, and limited feature information for small targets, all of which lower detection accuracy, while the large parameter counts and high complexity of current object detection models lower detection efficiency. To address both problems, this study proposes a lightweight YOLOv7-tiny algorithm for remote sensing image detection. First, the network neck is improved with group shuffle convolution (GSConv) and VoV-GSCSP modules, which reduces the computational cost and structural complexity of the model while maintaining sufficient detection accuracy. Second, a dynamic head (DyHead) combined with attention mechanisms is adopted for prediction. It improves the performance of the detection head by applying multi-head self-attention across scale-aware feature levels, spatially aware spatial positions, and task-aware output channels. Finally, the loss function of the original model is optimized by combining the normalized Wasserstein distance (NWD), a Wasserstein-distance-based metric for evaluating small target detection, with a bounding box regression loss based on the minimum point distance intersection over union (MPDIoU), which makes small target detection more robust. Experimental results show that the proposed algorithm achieves mAP@0.5 scores of 87.7% and 94.7% on the DIOR and RSOD datasets, respectively, which are 2.7 and 5.1 percentage points higher than those of the original YOLOv7-tiny model. The detection speed in frames per second (FPS) also increases by 12.2% and 11.9%, respectively. The proposed algorithm therefore effectively improves both the accuracy and the real-time performance of small target detection in remote sensing images.
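The loss described above pairs NWD with MPDIoU. The following is a minimal sketch of how the two quantities might be computed for a single pair of boxes, written in plain Python for readability. It follows the commonly published definitions of NWD (boxes modeled as 2D Gaussians) and MPDIoU (IoU penalized by the normalized corner distances). The abstract does not state how the two terms are weighted, so the weight `alpha`, the NWD constant `c`, and all function names here are illustrative assumptions, not the authors' implementation.

```python
import math


def iou_xyxy(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mpdiou(a, b, img_w, img_h):
    """MPDIoU: IoU minus the squared distances between the two top-left
    corners and the two bottom-right corners, each normalized by the
    squared diagonal of the input image."""
    norm = img_w ** 2 + img_h ** 2
    d_tl = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    d_br = (a[2] - b[2]) ** 2 + (a[3] - b[3]) ** 2
    return iou_xyxy(a, b) - d_tl / norm - d_br / norm


def nwd(a, b, c=12.8):
    """Normalized Wasserstein distance between two boxes modeled as 2D
    Gaussians with mean (cx, cy) and covariance diag(w^2/4, h^2/4).
    `c` is a dataset-dependent constant; 12.8 is an assumed placeholder."""
    cxa, cya = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    cxb, cyb = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    wa, ha = a[2] - a[0], a[3] - a[1]
    wb, hb = b[2] - b[0], b[3] - b[1]
    w2_sq = ((cxa - cxb) ** 2 + (cya - cyb) ** 2
             + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)
    return math.exp(-math.sqrt(w2_sq) / c)


def combined_box_loss(pred, target, img_w, img_h, alpha=0.5, c=12.8):
    """Hypothetical weighted sum of the NWD loss and the MPDIoU loss.
    The weight `alpha` is an assumption; the abstract gives no value."""
    loss_nwd = 1.0 - nwd(pred, target, c)
    loss_mpdiou = 1.0 - mpdiou(pred, target, img_w, img_h)
    return alpha * loss_nwd + (1.0 - alpha) * loss_mpdiou


# Two small, non-overlapping 4x4 boxes in an assumed 640x640 image.
box_a = (10.0, 10.0, 14.0, 14.0)
box_b = (16.0, 10.0, 20.0, 14.0)
small_iou = iou_xyxy(box_a, box_b)
small_nwd = nwd(box_a, box_b)
small_loss = combined_box_loss(box_a, box_b, 640, 640)
```

For the two disjoint boxes in this example the plain IoU is zero, so it carries no information about how far apart they are, whereas NWD still returns a value strictly between 0 and 1 that shrinks smoothly as the boxes move apart. That behavior is the usual motivation for using a Wasserstein-based term with small targets.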