Abstract Automatic extraction of buildings from satellite remote sensing images has a wide range of applications in economic and social development. Because satellite remote sensing images are affected by mutual occlusion, illumination, the background environment, and other factors, traditional methods struggle to extract buildings with high precision. This paper proposes an attention-enhanced feature pyramid network (FPN-SENet) and constructs a large-scale pixel-wise building dataset (the SCRS dataset) from multi-source high-resolution satellite images and vector data, enabling automatic extraction of buildings from multi-source satellite images. The network is compared with other fully convolutional neural networks. The results show that the accuracy of buildings extracted from the SCRS dataset is close to that obtained on the world's leading open-source satellite image dataset, and that accuracy on pseudo-color data is higher than on true-color data. FPN-SENet is more accurate than the other fully convolutional neural networks. Building extraction can be improved further by using the sum of cross entropy and the Dice coefficient as the loss function. The best classification model achieves an overall accuracy of 95.2%, a Kappa coefficient of 79.0%, an F1-score of 81.7%, and an IoU of 69.1%. This study can provide a reference for automatic building extraction from high-resolution satellite images.
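The loss function and the four accuracy measures named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a binary building/background task, it assumes the Dice term enters the loss as 1 - Dice (the usual convention, which the abstract does not spell out), and the function names `bce_dice_loss` and `binary_metrics` are hypothetical.

```python
import numpy as np


def _safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return float(num) / float(den) if den else 0.0


def bce_dice_loss(prob, target, eps=1e-7):
    """Binary cross entropy plus Dice loss (assumed here to be 1 - Dice).

    prob:   predicted building probabilities in [0, 1]
    target: ground-truth mask, 1 = building, 0 = background
    """
    prob = np.clip(np.asarray(prob, dtype=np.float64), eps, 1.0 - eps)
    target = np.asarray(target, dtype=np.float64)
    bce = -np.mean(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))
    intersection = np.sum(prob * target)
    dice = (2.0 * intersection + eps) / (np.sum(prob) + np.sum(target) + eps)
    return float(bce + (1.0 - dice))


def binary_metrics(pred, target):
    """Overall accuracy, Kappa, F1-score and IoU for the building class."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    tp = int(np.sum(pred & target))    # building predicted as building
    tn = int(np.sum(~pred & ~target))  # background predicted as background
    fp = int(np.sum(pred & ~target))   # background predicted as building
    fn = int(np.sum(~pred & target))   # building predicted as background
    n = tp + tn + fp + fn

    oa = _safe_div(tp + tn, n)
    # Expected agreement by chance, from the confusion-matrix marginals.
    pe = _safe_div((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    kappa = _safe_div(oa - pe, 1.0 - pe)
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    f1 = _safe_div(2.0 * precision * recall, precision + recall)
    iou = _safe_div(tp, tp + fp + fn)
    return {"overall_accuracy": oa, "kappa": kappa, "f1": f1, "iou": iou}


# Toy 2 x 4 masks: 2 true positives, 1 false positive, 1 false negative.
pred_mask = np.array([[1, 1, 0, 0], [1, 0, 0, 0]])
true_mask = np.array([[1, 0, 0, 0], [1, 1, 0, 0]])
metrics = binary_metrics(pred_mask, true_mask)
```

The functions return fractions rather than percentages. For a single binary class, F1 and IoU are linked by F1 = 2·IoU / (1 + IoU); the reported IoU of 69.1% gives about 81.7%, which matches the reported F1-score.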
Keywords: Chinese high-resolution satellite imagery; buildings; semantic segmentation; attention enhancement
Corresponding Authors:
ZHANG Qiao
E-mail: 451362006@qq.com; scrs_qiaozh@163.com
Issue Date: 21 July 2021