Remote Sensing for Land & Resources    2020, Vol. 32 Issue (4) : 53-60     DOI: 10.6046/gtzyyg.2020.04.08
Multi-scale architecture search method for remote sensing object detection
PEI Chan, LIAO Tiejun
College of Resources and Environment, Southwest University, Chongqing 400715, China
Abstract  

With the rise of the "smart city" concept, remote sensing object detection has gradually become an important tool for urban planning, construction, and maintenance. To characterize the distinguishing remote sensing features of different cities and to address the uneven generalization of detection models across object scales, this paper proposes a pyramid-structure search method based on mixed depthwise separable convolutions. First, the spatial distribution characteristics of the remote sensing image dataset are analyzed, a multi-receptive-field mixed-convolution search space is constructed accordingly, and the weights of its sub-networks are trained. Second, the number and structure of the feature-extraction units are searched iteratively by a reinforcement learning algorithm driven by the sequence of converged loss values. Finally, once the architecture reward function stabilizes, the corresponding architecture parameters and weight matrices are fixed, so that cross-scale image information can be fused adaptively on test data, improving the localization accuracy for similar targets at different resolutions. The network found by this method achieves a mean average precision (mAP) of 78.6% on the DIOR remote sensing dataset, 6 percentage points higher than CornerNet and 1.6 percentage points higher than Cascade R-CNN, with accuracy on small objects 2.1 percentage points above Cascade R-CNN. These results confirm the optimization capability of multi-scale architecture search for remote sensing object detection.
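The search loop summarized above can be sketched in a few lines. The example below is a hedged, simplified stand-in: it uses random sampling in place of the paper's reinforcement learning controller, and `mock_reward` is a placeholder for actually training a sub-network and measuring detection mAP. The search space mirrors the kernel groups and pyramid depths explored in Tab.2; all names are illustrative, not the authors' implementation.

```python
import random

# Hypothetical search space, loosely mirroring Tab.2: candidate groups of
# depthwise kernel sizes and candidate feature-pyramid depths.
KERNEL_GROUPS = [(3, 5, 7), (3, 5, 7, 9), (3, 5, 7, 9, 11)]
PYRAMID_DEPTHS = [3, 4, 5, 6]

def mock_reward(kernels, depth):
    # Placeholder for training the sampled sub-network and returning its
    # validation mAP; here it just scores larger search spaces slightly higher.
    return len(kernels) * 10 + depth + random.random()

def search(trials=50, seed=0):
    # Sample architectures, keep the best-rewarded one; a real controller
    # would update a policy from the reward sequence instead.
    random.seed(seed)
    best, best_reward = None, float("-inf")
    for _ in range(trials):
        arch = (random.choice(KERNEL_GROUPS), random.choice(PYRAMID_DEPTHS))
        reward = mock_reward(*arch)
        if reward > best_reward:
            best, best_reward = arch, reward
    return best, best_reward

if __name__ == "__main__":
    arch, reward = search()
    print("best architecture:", arch)
```

Once the reward plateaus over successive trials, the corresponding architecture and its trained weights would be frozen for evaluation, as the abstract describes.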

Keywords: deep learning; remote sensing detection; architecture search; image pyramid; recall
CLC number: TP183; TP751
Corresponding author: LIAO Tiejun. E-mail: 1371566711@qq.com; ltjhy-007@163.com
Issue Date: 23 December 2020
Cite this article:
Chan PEI, Tiejun LIAO. Multi-scale architecture search method for remote sensing object detection[J]. Remote Sensing for Land & Resources, 2020, 32(4): 53-60.
URL: https://www.gtzyyg.com/EN/10.6046/gtzyyg.2020.04.08 or https://www.gtzyyg.com/EN/Y2020/V32/I4/53
Fig.1  DIOR dataset size distribution chart
Fig.2-1  Image samples and instances from DIOR dataset
Fig.2-2  Image samples and instances from DIOR dataset
Fig.3  Classification efficiency of different backbone networks
Fig.4  Hybrid convolution unit
Fig.5  Two edge operations between layers
Fig.6  Intra-layer and inter-layer hybrid search principle
Fig.7  Influence of different convolution kernel size groups on search results
Fig.8  Main diagram of architecture search
Model               Backbone       mAP   AP_S  AP_L
YOLOv3[33]          Darknet-53     53.1  22.5  75.7
Faster R-CNN[34]    VGG16          54.2  33.9  78.6
Faster R-CNN+FPN    ResNet-50      60.6  42.2  81.9
RetinaNet+FPN       ResNet-50      62.9  43.5  86.5
CornerNet[35]       Hourglass-104  57.3  31.8  80.4
Cascade R-CNN[36]   ResNet-50      68.9  48.9  89.5
FCOS-FPN            ResNet-50      59.6  35.7  85.9
NAS-FPN             ResNet-50      64.8  44.8  84.6
NAS-FCOS[37]        ResNet-50      60.8  40.3  79.0
NAS-Mix-FPN         MixNet         70.5  51.0  89.1
Tab.1  Accuracy of different architectures on the DIOR test set (%)
Convolution search space       Pyramid layers  mAP   AP_S  AP_L
{3×3, 5×5, 7×7, 9×9, 11×11}    6               71.2  53.2  89.7
{3×3, 5×5, 7×7, 9×9, 11×11}    5               71.6  53.3  90.1
{3×3, 5×5, 7×7, 9×9, 11×11}    4               70.1  51.9  88.7
{3×3, 5×5, 7×7, 11×11}         4               70.2  50.9  89.3
{3×3, 5×5, 7×7, 9×9}           5               69.9  49.8  87.2
{3×3, 5×5, 7×7, 9×9}           4               70.5  51.0  89.1
{3×3, 5×5, 7×7, 9×9}           3               68.7  45.7  88.3
{3×3, 5×5, 7×7}                4               64.5  40.3  83.9
{3×3, 5×5, 7×7}                3               62.7  41.1  84.5
{1×1, 3×3, 5×5}                3               58.4  38.8  80.6
Tab.2  Performance of architectures searched from different convolution kernel groups on the DIOR test set (%)
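The mixed-convolution unit underlying the search spaces in Tab.2 partitions the channels of a feature map into groups and convolves each group depthwise with its own kernel size, in the spirit of MixNet[17]. The NumPy sketch below is a hedged illustration, not the paper's implementation: the uniform averaging kernel is a placeholder for learned depthwise weights, and the function names are hypothetical.

```python
import numpy as np

def depthwise_conv(x, k):
    # x: (C, H, W); k: odd square kernel size. 'Same' padding, stride 1.
    c, h, w = x.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    kernel = np.full((k, k), 1.0 / (k * k))  # placeholder for learned weights
    out = np.empty_like(x)
    for ci in range(c):          # one independent kernel per channel
        for i in range(h):
            for j in range(w):
                out[ci, i, j] = np.sum(xp[ci, i:i + k, j:j + k] * kernel)
    return out

def mix_conv(x, kernel_sizes=(3, 5, 7, 9)):
    # Split channels into one group per kernel size, convolve each group
    # depthwise at its own receptive field, then concatenate the results.
    groups = np.array_split(x, len(kernel_sizes), axis=0)
    return np.concatenate(
        [depthwise_conv(g, k) for g, k in zip(groups, kernel_sizes)], axis=0)

feat = np.random.rand(16, 32, 32)
out = mix_conv(feat)
print(out.shape)  # (16, 32, 32)
```

Because each group sees a different kernel size, a single unit mixes several receptive fields, which is what lets the searched pyramid fuse information across object scales.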
Fig.9  Remote sensing target detection of landmark buildings and facilities in Chongqing
[1] Krizhevsky A, Sutskever I, Hinton G. ImageNet classification with deep convolutional neural networks[C]. Proceedings of the 25th International Conference on Neural Information Processing Systems. ACM, 2012(1):1097-1105.
[2] Zhang C P, Su G D. Overview of face recognition technology[J]. Journal of Image and Graphics, 2000, 5(11): 885-894.
[3] Zhang D F, Liu Y H, Zhang R F. Intelligent driving assistance system based on deep learning[J]. Electronic Science and Technology, 2018, 31(10): 64-67.
[4] Li H X. Medical image segmentation based on deep learning[J]. Electronic Production, 2019, 369(4): 55-57.
[5] Lin T Y, Piotr D, Girshick R, et al. Feature pyramid networks for object detection[EB/OL]. (2017-04-19)[2019-12-24]. https://arxiv.org/abs/1612.03144.
[6] Szegedy C, Vanhoucke V, Ioffe S, et al. Rethinking the inception architecture for computer vision[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016:2818-2826.
[7] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2018:7132-7141.
[8] He K, Zhang X, Ren S, et al. Deep residual learning for image reco-gnition[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016:770-778.
[9] Huang G, Liu Z, Van Der Maaten L, et al. Densely connected convo-lutional networks[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2017:4700-4708.
[10] Liu S, Qi L, Qin H, et al. Path aggregation network for instance segmentation[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR). IEEE, 2018.
[11] Kong T, Sun F, Yao A, et al. RON:Reverse connection with objectness prior networks for object detection[EB/OL].(2017-07-06)[2019-12-24]. https://arxiv.org/abs/1707.01691.
[12] Kim S W, Kook H K, Sun J Y, et al. Parallel feature pyramid network for object detection[C]// Proceedings of the European Conference on Computer Vision(ECCV), 2018:234-250.
[13] Tan M, Le Q V. Efficientnet:Rethinking model scaling for convolutional neural networks[EB/OL]. (2019-11-23)[2019-12-24]. https://arxiv.org/abs/1905.11946.
[14] Zoph B, Le Q V. Neural architecture search with reinforcement learning[EB/OL]. [2019-12-24]. https://arxiv.org/pdf/1611.01578.pdf.
[15] Tan M, Chen B, Pang R, et al. Mnasnet:Platform-aware neural architecture search for mobile[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2019:2820-2828.
[16] Wu B, Dai X, Zhang P, et al. Fbnet:Hardware-aware efficient convnet design via differentiable neural architecture search[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2019:10734-10742.
[17] Tan M, Le Q V. MixNet:Mixed depthwise convolutional kernels[EB/OL]. (2019-12-01)[2019-12-24]. https://arxiv.org/abs/1907.09595.
[18] Deng J, Dong W, Socher R, et al. Imagenet:A large-scale hierarchical image database[C]// 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009:248-255.
[19] Liu C, Zoph B, Neumann M, et al. Progressive neural architecture search[EB/OL]. (2018-07-26)[2019-12-24]. https://arxiv.org/abs/1712.00559.
[20] Ghiasi G, Lin T Y, Pang R, et al. NAS-FPN:Learning scalable feature pyramid architecture for object detection[EB/OL]. (2019-04-16)[2019-12-24]. https://arxiv.org/abs/1904.07392.
[21] Li K, Wan G, Cheng G, et al. Object detection in optical remote sensing images:A survey and a new benchmark[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2019,159(2):296-307.
[22] Howard A G, Zhu M, Chen B, et al. MobileNets:Efficient convolutional neural networks for mobile vision applications[EB/OL]. (2017-04-17)[2019-12-24]. https://arxiv.org/abs/1704.04861.
[23] Rosenfeld A, Tsotsos J K. Incremental learning through deep adaptation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020,42(3):651-663.
[24] Howard A, Sandler M, Chu G, et al. Searching for MobileNetV3[EB/OL]. (2019-11-20)[2019-12-24]. https://arxiv.org/abs/1905.02244.
[25] Szegedy C, Liu W, Jia Y Q, et al. Going deeper with convolutions[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2015: 1-9.
[26] Ioffe S, Szegedy C. Batch normalization:Accelerating deep network training by reducing internal covariate shift[C]// International Conference on International Conference on Machine Learning, 2015.
[27] Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4,Inception-ResNet and the impact of residual connections on learning[EB/OL]. (2016-08-23)[2019-12-24]. https://arxiv.org/abs/1602.07261.
[28] Chen Y, Li J, Xiao H, et al. Dual path networks[EB/OL]. (2017-08-01)[2019-12-24]. https://arxiv.org/abs/1707.01629.
[29] Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017(99):2999-3007.
[30] Ramachandran P, Zoph B, Quoc V L. Searching for activation functions[EB/OL]. (2017-10-27)[2019-12-24]. https://arxiv.org/abs/1710.05941.
[31] Chen X L, Fang H, Lin T Y, et al. Microsoft COCO captions:Data collection and evaluation server[EB/OL]. (2015-04-03)[2019-12-24]. https://arxiv.org/abs/1504.00325.
[32] Everingham M, Gool L V, Williams C K I, et al. The pascal visual object classes (VOC) challenge[J]. International Journal of Computer Vision, 2010,88(2):303-338.
[33] Redmon J, Farhadi A. YOLOv3: An incremental improvement[EB/OL]. (2018-04-08)[2019-12-24]. https://arxiv.org/abs/1804.02767.
[34] Ren S, He K, Girshick R, et al. Faster R-CNN:Towards real-time object detection with region proposal networks[C]// Advances in Neural Information Processing Systems. 2015:91-99.
[35] Law H, Deng J. CornerNet:Detecting objects as paired keypoints[J]. International Journal of Computer Vision, 2020,128:642-656.
[36] Cai Z, Vasconcelos N. Cascade R-CNN:Delving into high quality object detection[EB/OL]. (2017-12-03)[2019-12-24]. https://arxiv.org/abs/1712.00726.
[37] Tian Z, Shen C, Chen H, et al. Fcos:Fully convolutional one-stage object detection[C]// Proceedings of the IEEE International Conference on Computer Vision. IEEE, 2019:9627-9636.
[38] Jin Y T, Yang X F, Gao T, et al. The typical object extraction method based on object-oriented and deep learning[J]. Remote Sensing for Land and Resources, 2018, 30(1): 22-29. doi: 10.6046/gtzyyg.2018.01.04.