Remote Sensing for Land & Resources    2021, Vol. 33 Issue (2) : 100-107     DOI: 10.6046/gtzyyg.2020230
Building extraction using high-resolution satellite imagery based on an attention enhanced full convolution neural network
GUO Wen 1, ZHANG Qiao 2
1. The Third Institute of Photogrammetry and Remote Sensing, Ministry of Natural Resources, Chengdu 610100, China
2. School of Geoscience and Technology, Southwest Petroleum University, Chengdu 610500, China
Abstract  

Automatic extraction of buildings from satellite remote sensing images has a wide range of applications in economic and social development. Because of mutual occlusion, illumination, complex backgrounds, and other factors in satellite imagery, traditional methods struggle to achieve high-precision building extraction. This paper proposes an attention-enhanced feature pyramid network (FPN-SENet) and constructs a large-scale pixel-wise building dataset (the SCRS dataset) from multi-source high-resolution satellite images and vector data, in order to realize automatic building extraction from multi-source satellite imagery; the network is compared with other fully convolutional networks. The results show that the accuracy of buildings extracted from the SCRS dataset is close to that of the world's leading open-source satellite image datasets, and that pseudo-color data yield higher accuracy than true-color data. FPN-SENet outperforms the other fully convolutional networks, and building extraction is further improved by using the sum of the cross-entropy and Dice-coefficient losses as the loss function. The best model achieves an overall accuracy of 95.2%, a Kappa coefficient of 79.0%, an F1-score of 81.7%, and an IoU of 69.1%. This study can serve as a reference for automatic building extraction from high-resolution satellite images.
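The best-performing loss above, the sum of cross-entropy and the Dice-coefficient loss (L_bce-dice in Tab.3), can be sketched in a few lines. This NumPy version is an assumption-laden sketch: it uses the common binary formulation with equal weighting of the two terms, since the paper's exact weighting is not reproduced here.

```python
import numpy as np

def bce_dice_loss(pred, target, eps=1e-7):
    """Sum of binary cross-entropy and Dice losses (cf. L_bce-dice in Tab.3).

    pred: predicted building probabilities in (0, 1); target: 0/1 ground truth.
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    # Per-pixel binary cross-entropy, averaged over the image
    bce = -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))
    # Soft Dice coefficient; 1 - dice penalizes poor region overlap
    intersection = np.sum(pred * target)
    dice = (2.0 * intersection + eps) / (np.sum(pred) + np.sum(target) + eps)
    return bce + (1.0 - dice)
```

The cross-entropy term drives per-pixel calibration while the Dice term directly rewards overlap with the building mask, which is why combining them tends to help on class-imbalanced scenes.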

Keywords: Chinese high-resolution satellite imagery; buildings; semantic segmentation; attention enhancement
CLC number: TP751; P237
Corresponding Authors: ZHANG Qiao     E-mail: 451362006@qq.com;scrs_qiaozh@163.com
Issue Date: 21 July 2021
Cite this article:   
Wen GUO, Qiao ZHANG. Building extraction using high-resolution satellite imagery based on an attention enhanced full convolution neural network[J]. Remote Sensing for Land & Resources, 2021, 33(2): 100-107.
URL:  
https://www.gtzyyg.com/EN/10.6046/gtzyyg.2020230     OR     https://www.gtzyyg.com/EN/Y2021/V33/I2/100
Fig.1  Structure diagram of FPN-SENet
Fig.2  Squeeze-and-excitation block
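The squeeze-and-excitation block of Fig.2 (Hu et al. [22]) recalibrates channels by pooling each feature map to a scalar, passing the result through two small fully connected layers, and rescaling the channels by the sigmoid output. A minimal NumPy sketch, where the weight arguments w1 and w2 are hypothetical stand-ins for the block's two learned FC layers:

```python
import numpy as np

def se_block(feature, w1, w2):
    """Squeeze-and-excitation recalibration of a C x H x W feature map.

    w1 has shape (C/r, C) and w2 shape (C, C/r), where r is the reduction ratio.
    """
    squeeze = feature.mean(axis=(1, 2))            # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)         # FC + ReLU (bottleneck)
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # FC + sigmoid -> channel weights
    return feature * scale[:, None, None]          # reweight each channel
```

Because the scale vector is computed from the whole spatial extent, the block injects global context at negligible cost, which is what the "attention enhancement" in FPN-SENet refers to.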
Fig.3  Spline window function
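The spline window of Fig.3 down-weights pixels near tile borders so that overlapping patch predictions can be blended without visible seams (the smooth-prediction effect shown in Fig.7). The exact spline used is not reproduced here; this sketch assumes one common choice, a squared triangular taper:

```python
import numpy as np

def spline_window(size, power=2):
    """1D taper that is largest at the tile center and near zero at the borders.

    Assumed squared-triangle form; the paper's exact spline may differ.
    """
    # Pixel-center coordinates mapped to [-1, 1]
    x = 2.0 * (np.arange(size) + 0.5) / size - 1.0
    return (1.0 - np.abs(x)) ** power

def blend_window(size, power=2):
    """2D weight map for a size x size tile, as the outer product of 1D tapers."""
    w = spline_window(size, power)
    return np.outer(w, w)
```

At inference, each tile's prediction is multiplied by this weight map, overlapping tiles are summed, and the sum is divided by the accumulated weights, so border pixels are dominated by the tiles in which they sit near the center.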
Fig.4  Examples of dataset with satellite image and label
Fig.5  Examples of path generalization
Fig.6  Test images and ground-truth of building
Dataset              OA     Kappa  Recall  Precision  F1/%   IoU/%
WHU                  0.995  0.804  0.772   0.845      80.7   67.6
SCRS (true color)    0.946  0.751  0.742   0.827      78.2   64.2
SCRS (pseudo color)  0.952  0.784  0.778   0.847      81.1   68.2
Tab.1  Comparison of the SCRS dataset with the WHU dataset
Method     OA     Kappa  Recall  Precision  F1/%   IoU/%
FCN-8s     0.932  0.672  0.639   0.800      71.0   55.1
SegNet     0.934  0.681  0.645   0.809      71.8   56.0
U-Net      0.941  0.735  0.707   0.818      75.8   61.1
PSPNet     0.936  0.689  0.645   0.827      72.5   56.8
FPN-SENet  0.946  0.751  0.742   0.827      78.2   64.2
Tab.2  Comparison of the FPN-SENET with other networks
Loss function  OA     Kappa  Recall  Precision  F1/%   IoU/%
L_bce          0.952  0.784  0.778   0.847      81.1   68.2
L_dice         0.948  0.779  0.833   0.786      80.9   67.9
L_bce-dice     0.952  0.790  0.820   0.814      81.7   69.1
Tab.3  Comparison of results from models trained with different loss functions
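The measures reported in Tabs.1-3 follow the standard binary confusion-matrix definitions (OA and Kappa as reviewed by Congalton [25]). A NumPy sketch, assuming the usual formulations:

```python
import numpy as np

def building_metrics(pred, truth):
    """Accuracy measures for a binary building mask: OA, Kappa, recall,
    precision, F1-score, and IoU, computed from the confusion matrix."""
    pred = np.asarray(pred, bool).ravel()
    truth = np.asarray(truth, bool).ravel()
    tp = np.sum(pred & truth)
    tn = np.sum(~pred & ~truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    n = tp + tn + fp + fn
    oa = (tp + tn) / n
    # Kappa: agreement beyond the chance expectation pe
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (oa - pe) / (1 - pe)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)
    return dict(OA=oa, Kappa=kappa, recall=recall,
                precision=precision, F1=f1, IoU=iou)
```

Note that OA and Kappa are reported as fractions in the tables while F1 and IoU are given in percent; all four derive from the same four confusion-matrix counts.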
Fig.7  Results with smooth prediction
Fig.8  The classified result of test images
[1] Blaschke T. Object based image analysis for remote sensing[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2010, 65(1):2-16.
doi: 10.1016/j.isprsjprs.2009.06.004 url: https://linkinghub.elsevier.com/retrieve/pii/S0924271609000884
[2] Yang Y, Newsam S. Geographic image retrieval using local invariant features[J]. IEEE Transactions on Geoscience and Remote Sensing, 2013, 51(2):818-832.
doi: 10.1109/TGRS.2012.2205158 url: http://ieeexplore.ieee.org/document/6257473/
[3] Li E, Femiani J, Xu S B, et al. Robust rooftop extraction from visible band images using higher order CRF[J]. IEEE Transactions on Geoscience and Remote Sensing, 2015, 53(8):4483-4495.
doi: 10.1109/TGRS.2015.2400462 url: http://ieeexplore.ieee.org/document/7047875/
[4] Melgani F, Bruzzone L. Classification of hyperspectral remote sensing images with support vector machines[J]. IEEE Transactions on Geoscience and Remote Sensing, 2004, 42(8):1778-1790.
doi: 10.1109/TGRS.2004.831865 url: http://ieeexplore.ieee.org/document/1323134/
[5] Hinton G E, Osindero S, Teh Y. A fast learning algorithm for deep belief nets[J]. Neural Computation, 2006, 18:1527-1554.
doi: 10.1162/neco.2006.18.7.1527 url: https://direct.mit.edu/neco/article/18/7/1527-1554/7065
[6] LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11):2278-2324.
[7] Graves A, Liwicki M, Fernandez S, et al. A novel connectionist system for unconstrained handwriting recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009, 31:855-868.
doi: 10.1109/TPAMI.2008.137 pmid: 19299860
[8] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]// International Conference on Neural Information Processing Systems, 2012.
[9] Mikolov T, Deoras A, Kombrink S, et al. Empirical evaluation and combination of advanced language modeling techniques[C]// Conference of the International Speech Communication Association, Florence, 2011.
[10] Dahl G E, Yu D, Deng L, et al. Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition[J]. IEEE Transactions on Audio, Speech and Language Processing, 2012, 20(1):30-42.
doi: 10.1109/TASL.2011.2134090 url: http://ieeexplore.ieee.org/document/5740583/
[11] Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation[C]// IEEE Conference on Computer Vision and Pattern Recognition,Boston,MA,USA, 2015.
[12] Wu G M, Shao X W, Guo Z L, et al. Automatic building segmentation of aerial imagery using multi-constraint fully convolutional networks[J]. Remote Sensing, 2018, 10(3):407-424.
doi: 10.3390/rs10030407 url: http://www.mdpi.com/2072-4292/10/3/407
[13] Zhang W K, Huang H, Schmitz M, et al. Effective fusion of multi-modal remote sensing data in a fully convolutional network for semantic labeling[J]. Remote Sensing, 2018, 10(52):1-14.
doi: 10.3390/rs10010001 url: http://www.mdpi.com/2072-4292/10/1/1
[14] Badrinarayanan V, Kendall A, Cipolla R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12):2481-2495.
doi: 10.1109/TPAMI.2016.2644615
[15] Ronneberger O, Fischer P, Brox T. U-Net:Convolutional networks for biomedical image segmentation[C]// Medical Image Computing and Computer-Assisted Intervention, 2015.
[16] Zhao H S, Shi J P, Qi X J, et al. Pyramid scene parsing network[C]// IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[17] Chen L C, Papandreou G, Kokkinos I, et al. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2018, 40:834-848.
[18] Mnih V. Machine learning for aerial image labeling[D]. Toronto: University of Toronto, 2013.
[19] Maggiori E, Tarabalka Y, Charpiat G, et al. Can semantic labeling methods generalize to any city? The inria aerial image labeling benchmark[C]// IEEE International Geoscience and Remote Sensing Symposium(IGARSS),Fort Worth,United States, 2017.
[20] Ji S P, Wei S Q. Building extraction via convolutional neural networks from an open remote sensing building dataset[J]. Acta Geodaetica et Cartographica Sinica, 2019, 48(4):448-459. (in Chinese)
[21] Lin T Y, Dollár P, Girshick R, et al. Feature pyramid networks for object detection[C]// IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[22] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]// IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[23] Yang J Y, Zhou Z X, Du Z R, et al. Rural construction land extraction from high spatial resolution remote sensing image based on SegNet semantic segmentation model[J]. Transactions of the Chinese Society of Agricultural Engineering, 2019, 35(5):251-258. (in Chinese)
[24] The First National Geographic National Conditions Census Leading Group Office of the State Council. Geographic national conditions census content and indicators[M]. Beijing: Surveying and Mapping Press, 2013. (in Chinese)
[25] Congalton R G. A review of assessing the accuracy of classifications of remotely sensed data[J]. Remote Sensing of Environment, 1991, 37(1):35-46.
doi: 10.1016/0034-4257(91)90048-B url: https://linkinghub.elsevier.com/retrieve/pii/003442579190048B
[26] Maggiori E, Tarabalka Y, Charpiat G, et al. Convolutional neural networks for large-scale remote-sensing image classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55:645-657.
doi: 10.1109/TGRS.2016.2612821 url: http://ieeexplore.ieee.org/document/7592858/