面向遥感目标检测的多尺度架构搜索方法

doi:10.6046/gtzyyg.2020.04.08

摘要
图/表
参考文献
相关文章
Metrics

全文: PDF(4629 KB)

HTML
输出: BibTeX | EndNote (RIS)

摘要

随着“智慧城市”概念的提出,遥感目标检测逐渐成为城镇的规划、建设和维护的重要方式。为了表征不同城市的差异化遥感特征,解决模型在不同尺度物体上泛化能力不均的问题,提出了一种基于混合分离卷积的金字塔架构搜索方法。首先,分析了遥感图像数据集的空间分布特征; 然后,针对其特点构建多感受野混合卷积的搜索空间,进而训练其子网络内的权重; 同时借助强化学习算法针对收敛的损失值序列,循环搜索特征提取单元的数量和结构; 最终,当架构奖励函数稳定时,固定相应的架构参数和权重矩阵,从而在测试数据上可以自适应融合图像的跨尺度信息,提高同类目标在不同分辨率下的定位精度。该方法搜索出的网络在DIOR遥感数据集上的平均准确率为78.6%,比CornerNet高6百分点,比Cascade R-CNN高1.6百分点,其中小物体准确率比Cascade R-CNN高2.1百分点,并证实了多尺度架构搜索在遥感目标检测的优化能力。

	服务

	把本文推荐给朋友
	加入引用管理器
	E-mail Alert
	RSS
	作者相关文章
	裴婵
	廖铁军

关键词 ：深度学习, 遥感检测, 架构搜索, 图像金字塔, 召回率

Abstract：

With the concept of “smart city”, remote sensing target detection has gradually become an important way for town planning, construction and maintenance. In order to characterize the differentiating remote sensing features of different cities and solve the problem of uneven generalization ability in the model on objects of different scales, this paper proposes a pyramid structure search method based on hybrid separation and convolution. Firstly, this paper analyzes the spatial distribution characteristics of the remote sensing image dataset, also constructs a multi-receptive field hybrid convolution search space based on its characteristics, and then trains the weights in its sub-network. Secondly, the number and structure of feature extraction units are searched cyclically with the help of reinforcement learning algorithms for the convergent loss value sequence. Finally, when the architecture reward function is stable, the corresponding architecture parameters and weight matrix are fixed, so that the cross-scale information of the image can be adaptively fused on the test data to improve the positioning accuracy of similar targets at different resolutions. The average accuracy of the network searched by this method on the DIOR remote sensing dataset is 78.6%, which is 6 percentage points higher than that of CornerNet, 1.6 percentage points higher than that of Cascade R-CNN, and the accuracy of small objects is 2.1 percentage points higher than that of Cascade R-CNN. The optimization ability of multi-scale architecture search in remote sensing target detection was confirmed.

Key words： deep learning remote sensing detection architectural search image pyramid recall

收稿日期: 2019-12-24 出版日期: 2020-12-23

:	TP183
	TP751

基金资助:教育部人文规划基金项目“三峡库区农业适度规模经管研究”(15XJA790002);国家自然科学基金重点项目“土壤中的电场——量子涨落耦合作用”(41530855)

通讯作者: 廖铁军

作者简介: 裴婵(1996-),女,硕士研究生,主要研究方向为土地资源信息管理。Email:1371566711@qq.com。

引用本文:

裴婵, 廖铁军. 面向遥感目标检测的多尺度架构搜索方法[J]. 国土资源遥感, 2020, 32(4): 53-60.
PEI Chan, LIAO Tiejun. Multi-scale architecture search method for remote sensing object detection. Remote Sensing for Land & Resources, 2020, 32(4): 53-60.

链接本文:

https://www.gtzyyg.com/CN/10.6046/gtzyyg.2020.04.08 或 https://www.gtzyyg.com/CN/Y2020/V32/I4/53

Fig.1 DIOR数据集中不同类的尺寸分布

Fig.2-1 DIOR数据集部分图像与类别实例

Fig.2-2 DIOR数据集部分图像与类别实例

Fig.3 不同主干网络的分类效率^{[25,26,27,28]}

Fig.4 混合卷积单元

Fig.5 2种层间边缘操作

Fig.6 层内和层间混合搜索原理

Fig.7 不同卷积核尺寸组对搜索结果的影响

Fig.8 架构搜索主要结果示意图

Tab.1 不同网络在DIOR测试集下的准确率

Tab.2 不同卷积搜索空间下的搜索结构在DIOR测试集上的性能

Fig.9 重庆市地标建筑、设施遥感目标检测

[1]	Krizhevsky A, Sutskever I, Hinton G. ImageNet classification with deep convolutional neural networks[C]. Proceedings of the 25th International Conference on Neural Information Processing Systems. ACM, 2012(1):1097-1105.
[2]	张翠平, 苏光大. 人脸识别技术综述[J]. 中国图象图形学报, 2000,5(11):885-894.
	Zhang C P, Su G D. Overview of face recognition technology[J]. Journal of Image and Graphics, 2000,5(11):885-894.
[3]	张达峰, 刘宇红, 张荣芬. 基于深度学习的智能辅助驾驶系统[J]. 电子科技, 2018,31(10):64-67.
	Zhang D F, Liu Y H, Zhang R F. Intelligent driving assistance system based on deep learning[J]. Electronic Science and Technology, 2018,31(10):64-67.
[4]	李昊轩. 基于深度学习的医疗图像分割[J]. 电子制作, 2019,369(4):55-57.
	Li H X. Medical image segmentation based on deep learning[J]. Electronic Production, 2019,369(4):55-57.
[5]	Lin T Y, Piotr D, Girshick R, et al. Feature pyramid networks for object detection[EB/OL]. (2017-04-19)[2019-12-24]. https://arxiv.org/abs/1612.03144.
[6]	Szegedy C, Vanhoucke V, Ioffe S, et al. Rethinking the inception architecture for computer vision[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016:2818-2826.
[7]	Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2018:7132-7141.
[8]	He K, Zhang X, Ren S, et al. Deep residual learning for image reco-gnition[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016:770-778.
[9]	Huang G, Liu Z, Van Der Maaten L, et al. Densely connected convo-lutional networks[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2017:4700-4708.
[10]	Liu S, Qi L, Qin H, et al. Path aggregation network for instance segmentation[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR). IEEE, 2018.
[11]	Kong T, Sun F, Yao A, et al. RON:Reverse connection with objectness prior networks for object detection[EB/OL].(2017-07-06)[2019-12-24]. https://arxiv.org/abs/1707.01691.
[12]	Kim S W, Kook H K, Sun J Y, et al. Parallel feature pyramid network for object detection[C]// Proceedings of the European Conference on Computer Vision(ECCV), 2018:234-250.
[13]	Tan M, Le Q V. Efficientnet:Rethinking model scaling for convolutional neural networks[EB/OL]. (2019-11-23)[2019-12-24]. https://arxiv.org/abs/1905.11946.
[14]	Zoph B, Le Q V. Neural architecture search with reinforcement learning[EB/OL]. [2019-12-24]. https://arxiv.org/pdf/1611.01578.pdf.
[15]	Tan M, Chen B, Pang R, et al. Mnasnet:Platform-aware neural architecture search for mobile[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2019:2820-2828.
[16]	Wu B, Dai X, Zhang P, et al. Fbnet:Hardware-aware efficient convnet design via differentiable neural architecture search[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2019:10734-10742.
[17]	Tan M, Le Q V. MixNet:Mixed depthwise convolutional kernels[EB/OL]. (2019-12-01)[2019-12-24]. https://arxiv.org/abs/1907.09595.
[18]	Deng J, Dong W, Socher R, et al. Imagenet:A large-scale hierarchical image database[C]// 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009:248-255.
[19]	Liu C, Zoph B, Neumann M, et al. Progressive neural architecture search[EB/OL]. (2018-07-26)[2019-12-24]. https://arxiv.org/abs/1712.00559.
[20]	Ghiasi G, Lin T Y, Pang R, et al. NAS-FPN:Learning scalable feature pyramid architecture for object detection[EB/OL]. (2019-04-16)[2019-12-24]. https://arxiv.org/abs/1904.07392.
[21]	Li K, Wan G, Cheng G, et al. Object detection in optical remote sensing images:A survey and a new benchmark[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2019,159(2):296-307.
[22]	Howard A G, Zhu M, Chen B, et al. MobileNets:Efficient convolutional neural networks for mobile vision applications[EB/OL]. (2017-04-17)[2019-12-24]. https://arxiv.org/abs/1704.04861.
[23]	Rosenfeld A, Tsotsos J K. Incremental learning through deep adaptation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020,42(3):651-663. pmid: 30507526
[24]	Howard A, Sandler M, Chu G, et al. Searching for MobileNetV3[J]. (2019-11-20)[2019-12-24]. https://arxiv.org/abs/1905.02244.
[25]	Szegedy C, Liu W, Jia Y Q, et al. Going deeper with convolutions[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2015,15523970.
[26]	Ioffe S, Szegedy C. Batch normalization:Accelerating deep network training by reducing internal covariate shift[C]// International Conference on International Conference on Machine Learning, 2015.
[27]	Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4,Inception-ResNet and the impact of residual connections on learning[EB/OL]. (2016-08-23)[2019-12-24]. https://arxiv.org/abs/1602.07261.
[28]	Chen Y, Li J, Xiao H, et al. Dual path networks[EB/OL]. (2017-08-01)[2019-12-24]. https://arxiv.org/abs/1707.01629.
[29]	Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017(99):2999-3007.
[30]	Ramachandran P, Zoph B, Quoc V L. Searching for activation functions[EB/OL]. (2017-10-27)[2019-12-24]. https://arxiv.org/abs/1710.05941.
[31]	Chen X L, Fang H, Lin T Y, et al. Microsoft COCO captions:Data collection and evaluation server[EB/OL]. (2015-04-03)[2019-12-24]. https://arxiv.org/abs/1504.00325.
[32]	Everingham M, Gool L V, Williams C K I, et al. The pascal visual object classes (VOC) challenge[J]. International Journal of Computer Vision, 2010,88(2):303-338.
[33]	Redmon J, Farhadi A. YOLOv3:An incremental improvement[J]. (2018-04-08)[2019-12-24]. https://arxiv.org/abs/1804.02767.
[34]	Ren S, He K, Girshick R, et al. Faster R-CNN:Towards real-time object detection with region proposal networks[C]// Advances in Neural Information Processing Systems. 2015:91-99.
[35]	Law H, Deng J. CornerNet:Detecting objects as paired keypoints[J]. International Journal of Computer Vision, 2020,128:642-656.
[36]	Cai Z, Vasconcelos N. Cascade R-CNN:Delving into high quality object detection[EB/OL]. (2017-12-03)[2019-12-24]. https://arxiv.org/abs/1712.00726.
[37]	Tian Z, Shen C, Chen H, et al. Fcos:Fully convolutional one-stage object detection[C]// Proceedings of the IEEE International Conference on Computer Vision. IEEE, 2019:9627-9636.
[38]	金永涛, 杨秀峰, 高涛, 等. 基于面向对象与深度学习的典型地物提取[J]. 国土资源遥感, 2018,30(1):22-29.doi: 10.6046/gtzyyg.2018.01.04.
	Jin Y T, Yang X F, Gao T, et al. The typical object extraction method based on object-oriented and deep learning[J]. Remote Sensing for Land and Resources, 2018,30(1):22-29.doi: 10.6046/gtzyyg.2018.01.04.

[1]	薛白, 王懿哲, 刘书含, 岳明宇, 王艺颖, 赵世湖. 基于孪生注意力网络的高分辨率遥感影像变化检测[J]. 自然资源遥感, 2022, 34(1): 61-66.
[2]	郭晓征, 姚云军, 贾坤, 张晓通, 赵祥. 基于U-Net深度学习方法火星沙丘提取研究[J]. 自然资源遥感, 2021, 33(4): 130-135.
[3]	冯东东, 张志华, 石浩月. 基于多元数据的省会城市城中村精细提取[J]. 自然资源遥感, 2021, 33(3): 272-278.
[4]	武宇, 张俊, 李屹旭, 黄康钰. 基于改进U-Net的建筑物集群识别研究[J]. 国土资源遥感, 2021, 33(2): 48-54.
[5]	卢麒, 秦军, 姚雪东, 吴艳兰, 朱皓辰. 基于多层次感知网络的GF-2遥感影像建筑物提取[J]. 国土资源遥感, 2021, 33(2): 75-84.
[6]	安健健, 孟庆岩, 胡蝶, 胡新礼, 杨健, 杨天梁. 基于Faster R-CNN的火电厂冷却塔检测及工作状态判定[J]. 国土资源遥感, 2021, 33(2): 93-99.
[7]	蔡祥, 李琦, 罗言, 齐建东. 面向对象结合深度学习方法的矿区地物提取[J]. 国土资源遥感, 2021, 33(1): 63-71.
[8]	刘钊, 廖斐凡, 赵桐. 基于PSPNet的遥感影像城市建成区提取及其优化方法[J]. 国土资源遥感, 2020, 32(4): 84-89.
[9]	郑智腾, 范海生, 王洁, 吴艳兰, 王彪, 黄腾杰. 改进型双支网络模型的遥感海水网箱养殖区智能提取方法[J]. 国土资源遥感, 2020, 32(4): 120-129.
[10]	杜方洲, 石玉立, 盛夏. 基于深度学习的TRMM降水产品降尺度研究——以中国东北地区为例[J]. 国土资源遥感, 2020, 32(4): 145-153.
[11]	蔡之灵, 翁谦, 叶少珍, 简彩仁. 基于Inception-V3模型的高分遥感影像场景分类[J]. 国土资源遥感, 2020, 32(3): 80-89.
[12]	刘文雅, 岳安志, 季珏, 师卫华, 邓孺孺, 梁业恒, 熊龙海. 基于DeepLabv3+语义分割模型的GF-2影像城市绿地提取[J]. 国土资源遥感, 2020, 32(2): 120-129.
[13]	卜丽静, 李秀伟, 张正鹏, 姜昊男. 条件生成对抗网络在遥感图像复原中的可行性[J]. 国土资源遥感, 2020, 32(1): 27-34.
[14]	于博, 张军军, 李春庚, 安居白. 图像语义分割辅助的车载激光点云道路提取方法[J]. 国土资源遥感, 2020, 32(1): 66-74.
[15]	吴海平, 黄世存. 基于深度学习的新增建设用地信息提取试验研究——全国土地利用遥感监测工程创新探索[J]. 国土资源遥感, 2019, 31(4): 159-166.

Viewed

Full text

Abstract

Cited

Shared

Discussed