Multi-directional target detection based on depth features
YU Miao1, JING Hongbo1, WANG Xiang2, LI Xingjiu1
1. Beijing Urban Construction Survey, Design and Research Institute Co., Ltd., Beijing 100101, China; 2. Emerging Hua’an Smart Technology Co., Ltd., Beijing 100160, China
In recent years, target detection, an important branch of computer vision, has been widely applied in fields such as medicine, the military, and urban rail transit. As satellite and remote sensing technologies advance, the images they produce contain abundant information, which makes automatic target detection and image understanding crucial. However, because targets in remote sensing images are arbitrarily oriented and densely distributed, conventional methods are prone to missed and false detections. In response, this study proposes a multi-convolution kernel feature combination-based adaptive region proposal network (MFCARPN) algorithm for multi-directional target detection. The algorithm extracts target features with multiple convolution kernels. The weights of these kernel features are learned adaptively according to the differences between targets, yielding feature patterns that better match each target. These features are then combined with the original target features so that the parameters of the classification and regression models vary dynamically from target to target, which improves the adaptive ability of the region proposal network (RPN). Experimental results show that the algorithm achieves a mean average precision (mAP) of 75.52% on the standard DOTA dataset, 0.5 percentage points higher than the baseline Gliding Vertex (GV) algorithm, which indicates that the proposed MFCARPN algorithm is effective.
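The feature-combination idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the kernel shapes (a horizontal 1×3 and a vertical 3×1 kernel), the softmax normalization of branch weights, and the residual addition of the original features are all assumptions made for illustration, and the names `conv2d_same`, `combine_kernel_features`, and `branch_scores` are hypothetical. In the actual method the branch weights would be learned by the network for each target; here they are passed in as fixed scores.

```python
import math


def conv2d_same(feature, kernel):
    """Zero-padded 'same' 2D cross-correlation of a single-channel map.

    Assumes the kernel has odd height and width.
    """
    h, w = len(feature), len(feature[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for u in range(kh):
                for v in range(kw):
                    y, x = i + u - ph, j + v - pw
                    if 0 <= y < h and 0 <= x < w:
                        acc += feature[y][x] * kernel[u][v]
            out[i][j] = acc
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_kernel_features(feature, kernels, branch_scores):
    """Fuse several kernel responses with normalized weights.

    branch_scores stands in for weights that a network would learn
    adaptively (an assumption of this sketch). The original feature
    map is added back so that the fused map keeps the original
    features alongside the kernel features.
    """
    weights = softmax(branch_scores)
    branches = [conv2d_same(feature, k) for k in kernels]
    h, w = len(feature), len(feature[0])
    fused = [
        [
            feature[i][j] + sum(wt * b[i][j] for wt, b in zip(weights, branches))
            for j in range(w)
        ]
        for i in range(h)
    ]
    return fused, weights


# A horizontal bar as a toy "target".
feature = [
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0],
]
kernels = [
    [[1 / 3, 1 / 3, 1 / 3]],      # horizontal 1x3 averaging kernel
    [[1 / 3], [1 / 3], [1 / 3]],  # vertical 3x1 averaging kernel
]
fused, weights = combine_kernel_features(feature, kernels, [0.0, 0.0])
```

With equal scores, both branches contribute equally. Raising the score of the horizontal branch would shift the fused response toward the kernel that matches the bar's orientation, which is the kind of per-target adaptation the abstract attributes to the learned weights.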