Multi-task learning for building object semantic segmentation of remote sensing image based on Unet network

doi:10.6046/gtzyyg.2020.04.11

Abstract:

To accurately segment buildings in high-resolution remote sensing images, this paper proposes a building semantic segmentation method based on multi-task learning with a Unet network. First, a boundary distance map is generated from the building ground-truth map of each remote sensing image, and the boundary distance map, the original image, and the ground-truth map are used together as the input to the Unet network. Then, a building prediction layer and a boundary distance prediction layer are added at the end of a Unet network built on ResNet, forming a multi-task network. Finally, a loss function is defined for the multi-task network, which is trained with the Adam optimization algorithm. In experiments on the Inria aerial image building labeling dataset, with the fully convolutional network combined with a multi-layer perceptron as the baseline, VGG16, VGG16 + boundary prediction, ResNet50, and the proposed method raise the intersection over union (IoU) by 5.15, 6.94, 6.41, and 7.86 percentage points and raise the accuracy to 94.71%, 95.39%, 95.30%, and 96.10%, respectively. The proposed method can therefore extract buildings from high-resolution remote sensing images with high accuracy.

Key words: Unet network, multi-task learning, remote sensing image, semantic segmentation, ResNet network
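
The first step of the method, deriving a boundary distance map from a building ground-truth mask, can be illustrated with a minimal sketch. The abstract does not state the distance metric, the truncation threshold, or the sign convention, so every specific below is an assumption made for illustration: the function name `boundary_distance_map`, the 4-connected (city-block) step distance, the hypothetical `truncate` parameter, and the choice of positive values inside buildings and negative values outside. Only the Python standard library is used; a real pipeline would more likely rely on an array library.

```python
from collections import deque


def boundary_distance_map(mask, truncate=5):
    """Truncated, signed distance from each pixel to the nearest building boundary.

    mask: list of rows of 0/1 values, where 1 marks a building pixel.
    Assumptions (not taken from the paper): 4-connected step distance,
    truncation at `truncate`, positive sign inside buildings.
    """
    height, width = len(mask), len(mask[0])
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def in_image(r, c):
        return 0 <= r < height and 0 <= c < width

    # Pixels the search never reaches keep the truncation value.
    dist = [[truncate] * width for _ in range(height)]
    queue = deque()

    # Seed with boundary pixels: building pixels that touch a background pixel.
    for r in range(height):
        for c in range(width):
            if mask[r][c] == 1 and any(
                in_image(r + dr, c + dc) and mask[r + dr][c + dc] == 0
                for dr, dc in steps
            ):
                dist[r][c] = 0
                queue.append((r, c))

    # Multi-source breadth-first search: each step adds one pixel of distance.
    while queue:
        r, c = queue.popleft()
        if dist[r][c] + 1 >= truncate:
            continue  # neighbours would not get below the truncation value
        for dr, dc in steps:
            nr, nc = r + dr, c + dc
            if in_image(nr, nc) and dist[nr][nc] > dist[r][c] + 1:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))

    # Assumed sign convention: positive inside buildings, negative outside.
    return [
        [d if mask[r][c] == 1 else -d for c, d in enumerate(row)]
        for r, row in enumerate(dist)
    ]


# Toy ground truth: a 3 x 3 building centred in a 5 x 5 tile.
example_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
for row in boundary_distance_map(example_mask, truncate=3):
    print(row)
```

On this toy tile the building's edge pixels get 0, its centre pixel gets 1, and background pixels get negative values that grow in magnitude away from the building. A map of this kind would serve as the regression target for the boundary distance prediction layer, alongside the binary mask used by the building prediction layer.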

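Received: 2019-11-14 Published: 2020-12-23

CLC number: TP751.1

Funding: Henan Province Science and Technology Research Project "Research on Intelligent Video and Image Perception Technology for the Internet of Things" (192102210290); Key Scientific Research Project of Colleges and Universities in Henan Province "Research on Fast Semantic Image Segmentation Methods in Internet of Things Perception" (15A520080)

About the author: LIU Shangwang (1973-), male, associate professor, Ph.D. His main research interests are computer vision and image processing. Email: shwl08@126.com.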

Cite this article:

LIU Shangwang, CUI Zhiyong, LI Daoyi. Multi-task learning for building object semantic segmentation of remote sensing image based on Unet network. Remote Sensing for Land & Resources, 2020, 32(4): 74-83.

Link to this article:

https://www.gtzyyg.com/CN/10.6046/gtzyyg.2020.04.11 or https://www.gtzyyg.com/CN/Y2020/V32/I4/74

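Tab.1 Visualization of training data

Fig.1 Structure of the multi-task network

Tab.2 Experimental results of different methods

Tab.3 Building segmentation results of different methods on remote sensing images

Tab.4 Visualization of the boundary distance prediction layer output

Fig.2 Line chart of loss versus training epoch

Fig.3 Building segmentation results of different methods on real remote sensing images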
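
The results are reported as intersection over union (IoU) and accuracy, with IoU gains given in percentage points. The sketch below shows how these quantities are commonly computed for a binary building mask. It is not the authors' evaluation code: the function names are hypothetical, "accuracy" is assumed to mean overall pixel accuracy, and returning an IoU of 1.0 when both masks are empty is an assumed convention.

```python
def confusion_counts(pred, truth):
    """Count true/false positives and negatives over two 0/1 masks of equal shape."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def iou(pred, truth):
    """Intersection over union of the building class."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    union = tp + fp + fn
    # Assumed convention: two empty masks agree perfectly.
    return tp / union if union else 1.0


def pixel_accuracy(pred, truth):
    """Share of pixels whose predicted label matches the ground truth."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    return (tp + tn) / (tp + fp + fn + tn)


def gain_in_percentage_points(score, baseline_score):
    """Absolute difference of two scores in [0, 1], expressed in percentage points."""
    return (score - baseline_score) * 100.0


# Toy masks, purely illustrative (not data from the paper).
truth = [[1, 1, 0],
         [0, 1, 0]]
pred = [[1, 0, 0],
        [0, 1, 1]]
print(iou(pred, truth), pixel_accuracy(pred, truth))
```

A gain stated in percentage points is an absolute difference, so an improvement of 7.86 percentage points corresponds to the IoU itself rising by 0.0786 over the baseline, not to a 7.86% relative increase.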
