Remote Sensing for Natural Resources, 2025, Vol. 37, Issue 3: 104-112    DOI: 10.6046/zrzyyg.2024047
Detecting ships from SAR images based on high-dimensional contextual attention and dual receptive field enhancement
GUO Wei, LI Yu, JIN Haibo
School of Software, Liaoning University of Technology, Huludao 125105, China
Abstract  

The abundant contextual information in synthetic aperture radar (SAR) images remains underutilized in deep learning-based ship detection. To address this, this study proposed a method for detecting ships in SAR images based on high-dimensional contextual attention and dual receptive field enhancement. The dual receptive field enhancement extracts multi-dimensional feature information from SAR images and thereby guides the dynamic attention matrix to learn rich contextual information during the coarse-to-fine extraction of high-dimensional features. Building on YOLOv7, a YOLO-HD network was constructed by incorporating a lightweight convolutional module, a lightweight asymmetric multi-level compression detection head, and a new loss function, XIoU. Comparative experiments were conducted on the E-HRSID and SSDD datasets, where the proposed method achieved mean average precision (mAP) values of 91.36% and 97.64%, respectively, improvements of 4.56 and 9.83 percentage points over the original model, and it outperformed other classical models.
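The abstract only summarizes the two modules; their exact structures are given in Figs. 2-3 of the full paper. As a rough, hypothetical illustration of the general idea of letting two receptive fields guide a spatial attention map, the PyTorch sketch below pairs a plain 3×3 branch with a dilated 3×3 branch and uses their fused response to reweight the input features. Every choice here (kernel sizes, dilation rate, sigmoid gating, module and class names) is an assumption for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DualReceptiveFieldEnhancement(nn.Module):
    """Hypothetical sketch: two branches with different effective receptive
    fields whose fused response modulates the input features spatially."""

    def __init__(self, channels: int):
        super().__init__()
        self.local_branch = nn.Conv2d(channels, channels, 3, padding=1)                # small receptive field
        self.context_branch = nn.Conv2d(channels, channels, 3, padding=2, dilation=2)  # enlarged receptive field
        self.fuse = nn.Conv2d(2 * channels, channels, 1)                                # 1x1 fusion
        self.gate = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        local = self.local_branch(x)
        context = self.context_branch(x)
        attention = self.gate(self.fuse(torch.cat([local, context], dim=1)))
        return x * attention  # context-guided reweighting of the features

if __name__ == "__main__":
    feats = torch.randn(1, 64, 80, 80)   # e.g. one backbone feature map
    out = DualReceptiveFieldEnhancement(64)(feats)
    print(out.shape)                     # torch.Size([1, 64, 80, 80])
```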

Keywords: deep learning; computer vision; YOLOv7; synthetic aperture radar (SAR) image; ship detection; attention mechanism
CLC number: TP79
Issue Date: 01 July 2025
Cite this article:   
Wei GUO, Yu LI, Haibo JIN. Detecting ships from SAR images based on high-dimensional contextual attention and dual receptive field enhancement[J]. Remote Sensing for Natural Resources, 2025, 37(3): 104-112.
URL: https://www.gtzyyg.com/EN/10.6046/zrzyyg.2024047 or https://www.gtzyyg.com/EN/Y2025/V37/I3/104
Fig.1  YOLO-HD algorithm model diagram
Fig.2  Implementation details of high-dimensional contextual attention structure
Fig.3  Dual receptive field enhancement structure
Fig.4  Implementation details of lightweight convolution layer structure
Model          |          E-HRSID              |            SSDD
               |  P/%     R/%   mAP/%    F1    |   P/%     R/%   mAP/%    F1
CenterNet      | 96.63   60.95   76.77   0.75  |  97.46   75.42   89.04   0.83
EfficientDet   | 97.29   22.36   34.73   0.36  |  95.77   25.09   73.83   0.40
Faster R-CNN   | 34.30   35.71   26.95   0.35  |  73.50   68.07   88.12   0.71
RetinaNet      | 93.31   27.65   34.35   0.43  |  86.81   63.14   80.86   0.73
SSD            | 88.79   16.34   40.64   0.28  |  95.80   42.07   89.58   0.58
YOLOv7         | 88.16   78.16   86.80   0.83  |  90.80   78.00   87.81   0.84
YOLOv8         | 89.53   83.27   90.47   0.86  |  95.40   91.80   97.10   0.94
YOLO-HD        | 90.65   84.36   91.36   0.87  |  95.25   95.55   97.64   0.95
Tab.1  Comparative experiment results on the E-HRSID and SSDD datasets
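For reference, the F1 column in Tab.1 is the harmonic mean of the listed precision and recall. A minimal check against the YOLO-HD rows (values taken directly from the table; the helper name f1_score is ours):

```python
# Verify that the tabulated F1 values follow from P and R: F1 = 2PR / (P + R).
def f1_score(precision_pct: float, recall_pct: float) -> float:
    p, r = precision_pct / 100.0, recall_pct / 100.0
    return 2 * p * r / (p + r)

# YOLO-HD on E-HRSID: P = 90.65 %, R = 84.36 %
print(round(f1_score(90.65, 84.36), 2))  # 0.87, matching Tab.1
# YOLO-HD on SSDD: P = 95.25 %, R = 95.55 %
print(round(f1_score(95.25, 95.55), 2))  # 0.95, matching Tab.1
```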
Fig.5  P-R curves of each model
Tab.2  Detection results of four models (visual comparison on four example SAR images: ground-truth annotations versus detections by YOLOv7, YOLO-HD, YOLOv8, and CenterNet; the result images appear in the original article)
Model               DRFE  HD-ELAN  LAMCD  XIoU  L-ELAN |   P/%     R/%   mAP/%   Params/MB   GFLOPS
Baseline (YOLOv7)                                      |  88.16   78.16   86.80       38.4    105.4
Net1                                                   |  88.14   81.39   88.85       43.4    108.7
Net2                                                   |  90.54   81.10   89.79       39.4    162.1
Net3                                                   |  90.45   78.90   89.29       37.6    141.2
Net4                                                   |  91.21   82.92   90.89       51.0    187.0
Proposed (YOLO-HD)                                     |  90.65   84.36   91.36       56.8    143.2
Tab.3  Ablation experiment results (the check marks indicating which of the DRFE, HD-ELAN, LAMCD, XIoU, and L-ELAN modules each variant uses appear in the original table)
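The Params/MB and GFLOPS columns in Tab.3 are standard complexity measures for the ablated variants. As a hedged sketch (the actual YOLO-HD network is not reproduced in this excerpt, so a toy model stands in), the snippet below shows one common way to obtain a parameter count and an fp32 weight size in PyTorch; FLOPs are typically measured separately with a profiler such as thop.

```python
import torch.nn as nn

def model_footprint(model: nn.Module) -> tuple[float, float]:
    """Return (parameters in millions, fp32 weight size in MB)."""
    n_params = sum(p.numel() for p in model.parameters())
    return n_params / 1e6, n_params * 4 / 2**20  # 4 bytes per float32 weight

# Toy stand-in; the real YOLO-HD detector is defined in the authors' code, not here.
toy = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.SiLU(), nn.Conv2d(64, 128, 3, padding=1))
millions, size_mb = model_footprint(toy)
print(f"{millions:.3f} M parameters, {size_mb:.2f} MB (fp32)")
```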