Abstract Ships in remote sensing images with complex backgrounds are often arbitrarily oriented and densely arranged, which causes missed detections. To address this problem, this study proposes a ship target detection algorithm, built on the rotation region generation network, that uses multi-scale feature enhancement of remote sensing images. First, at the feature extraction stage, the feature pyramid network is improved with a densely connected receptive field module, which obtains multi-scale receptive field features through convolutions with different dilation rates and thereby strengthens the expression of high-level semantic information. Second, a feature fusion structure based on attention mechanisms is designed to suppress noise and highlight target features: all layers are fused according to the spatial weight of each layer to obtain a feature layer that accounts for both semantic and position information, attention enhancement is applied to the features of this layer, and the enhanced features are integrated back into the original feature layers of the pyramid network. Finally, an attention loss is added, and the attention network is optimized together with the classification and regression losses so that the network pays more attention to target locations. Experimental results on the DOTA remote sensing dataset show that the algorithm achieves an average precision of 71.61%, which is higher than that of the latest ship target detection algorithms for remote sensing images, indicating that it effectively addresses missed detections in ship target detection.
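The two core ideas in the abstract, multi-scale receptive fields from different dilation rates and fusion of layers by per-position spatial weights, can be illustrated with a minimal one-dimensional sketch. This is not the authors' network: the function names, the toy signal, the all-ones kernel, and the choice to use each branch's own response as its fusion score are all assumptions made for illustration. In the paper the weights would be learned and the inputs would be two-dimensional feature maps.

```python
import math


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded 'same' 1D cross-correlation with a dilated kernel."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * dilation  # taps are `dilation` apart
            if 0 <= idx < len(signal):         # outside the signal counts as 0
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def fuse_by_spatial_weight(layers, scores):
    """Fuse equal-length layers using a per-position softmax over layers."""
    fused = []
    for pos in range(len(layers[0])):
        peak = max(s[pos] for s in scores)               # for numerical stability
        exps = [math.exp(s[pos] - peak) for s in scores]
        total = sum(exps)
        fused.append(sum(e / total * layer[pos]
                         for e, layer in zip(exps, layers)))
    return fused


# Toy input: a single impulse, so each branch reveals its sampling pattern.
signal = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
kernel = [1.0, 1.0, 1.0]

# One branch per dilation rate (hypothetical rates 1, 2, 3).
branches = [dilated_conv1d(signal, kernel, d) for d in (1, 2, 3)]

# Assumption: each branch's own response stands in for a learned score map.
fused = fuse_by_spatial_weight(branches, branches)

for d, branch in zip((1, 2, 3), branches):
    print(d, receptive_field(len(kernel), d), branch)
print(fused)
```

With a 3-tap kernel, dilation rates 1, 2 and 3 span 3, 5 and 7 input positions respectively, so the same number of weights covers progressively wider context. Because the fusion weights at each position form a softmax, the fused value is always a convex combination of the branch values at that position.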
Keywords
convolutional neural network
multi-scale feature fusion
attention mechanism
remote sensing image
ship target detection
Corresponding Authors:
GAO Jiankang
E-mail: liuwanjun@lntu.edu.cn;1554797460@qq.com
Issue Date: 24 September 2021