Remote Sensing for Natural Resources    2024, Vol. 36 Issue (4) : 107-116     DOI: 10.6046/zrzyyg.2023146
Building extraction from high-resolution images using a hybrid attention mechanism combined with multi-scale feature enhancement
QU Haicheng, LIANG Xu
College of Software, Liaoning Technical University, Huludao 125105, China
Abstract  

Accurately extracting buildings from high-resolution remote sensing images is challenging because backgrounds are complex and variable and building shapes are diverse. This study developed a semantic segmentation network for high-resolution building extraction, the building mining net (BMNet), which integrated a hybrid attention mechanism with multi-scale feature enhancement. First, the encoder used VGG-16 as the backbone network to extract features, yielding four layers of feature representations. Then, a decoder was designed to address the loss of detail in high-layer features within the multi-scale information. Specifically, a series attention module (SAM), which combined channel attention and spatial attention, was introduced to enhance the representation capability of high-layer features. Additionally, a building mining module (BMM) with progressive feature enhancement was designed to further improve the accuracy of building segmentation. Taking the upsampled feature map, the feature map processed by the SAM, and the initial prediction as input, the BMM output background noise information and then filtered out background information using the context information exploration module designed in this study. The final prediction was obtained after several passes through the BMM. Comparative experiments indicate that BMNet outperformed U-Net: precision and intersection over union (IoU) increased by 4.6 and 4.8 percentage points, respectively, on the WHU Building dataset, by 7.9 and 8.9 percentage points on the Massachusetts Buildings dataset, and by 6.7 and 11.0 percentage points on the Inria Aerial Image Labeling dataset. These results validate the effectiveness and practicality of the proposed model.
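The serial combination of channel attention and spatial attention described above can be illustrated with a toy example. The sketch below is not the authors' SAM: it assumes a channel-then-spatial order (as in CBAM), drops all learned layers, and replaces them with a plain sigmoid over pooled averages. The function names `channel_gate`, `spatial_gate`, and `serial_attention` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(feat):
    """Scale each channel by sigmoid(global average of that channel)."""
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        weight = sigmoid(sum(values) / len(values))
        out.append([[v * weight for v in row] for row in channel])
    return out


def spatial_gate(feat):
    """Scale each pixel by sigmoid(mean over channels at that pixel)."""
    n_ch = len(feat)
    height, width = len(feat[0]), len(feat[0][0])
    weights = [
        [sigmoid(sum(feat[c][i][j] for c in range(n_ch)) / n_ch)
         for j in range(width)]
        for i in range(height)
    ]
    return [
        [[feat[c][i][j] * weights[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(n_ch)
    ]


def serial_attention(feat):
    """Assumed order: channel gating first, then spatial gating."""
    return spatial_gate(channel_gate(feat))


# Toy feature map: 2 channels, 2 x 2 pixels (channels x height x width).
feature = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 2.0], [0.0, 0.0]],
]
refined = serial_attention(feature)
```

Because both gates lie strictly between 0 and 1, the output keeps the input shape and only rescales responses. A trained module would learn which channels and locations to emphasize rather than deriving the weights from raw averages.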

Keywords: semantic segmentation; high spatial resolution remote sensing image; building information extraction; U-net; attention mechanism; dilated convolution
CLC number: TP751
Issue Date: 23 December 2024
Cite this article:   
Haicheng QU,Xu LIANG. Building extraction from high-resolution images using a hybrid attention mechanism combined with multi-scale feature enhancement[J]. Remote Sensing for Natural Resources, 2024, 36(4): 107-116.
URL:  
https://www.gtzyyg.com/EN/10.6046/zrzyyg.2023146     OR     https://www.gtzyyg.com/EN/Y2024/V36/I4/107
Fig.1  BMNet network structure
Fig.2  Mixed attention module
Fig.3  Channel attention module
Fig.4  Spatial attention module
Fig.5  Building mining module structure
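Network combination   Module description
BM-0                  Baseline: a U-Net using only VGG-16 as the backbone
BM-1                  Baseline + spatial attention module (SA)
BM-2                  Baseline + spatial attention module + channel attention module (SA + CA)
BM-3                  Baseline + SA + CA + BMM (without the CEB module)
BM-4                  Baseline + SA + CA + BMM (with the CEB module)
BM-5                  Baseline + channel attention module (CA)
BM-6                  Baseline + CA + BMM (without CEB)
BM-7                  Baseline + CA + BMM (with CEB)
BM-8                  Baseline + SA + BMM (without CEB)
BM-9                  Baseline + SA + BMM (with CEB)
BM-10                 Baseline + BMM (without CEB)
BM-11                 Baseline + BMM (with CEB)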
Tab.1  Combination structure of BMNet
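The twelve configurations in Tab.1 can be read as a full factorial design over three choices: spatial attention on or off, channel attention on or off, and the BMM absent, present without CEB, or present with CEB. The sketch below transcribes the table into a dictionary and checks that reading. The dictionary layout and the labels `"none"`, `"bmm"`, and `"bmm+ceb"` are illustrative choices, not names from the paper.

```python
from itertools import product

# Each entry: (uses SA, uses CA, BMM setting), transcribed from Tab.1.
ABLATION = {
    "BM-0": (False, False, "none"),
    "BM-1": (True, False, "none"),
    "BM-2": (True, True, "none"),
    "BM-3": (True, True, "bmm"),
    "BM-4": (True, True, "bmm+ceb"),
    "BM-5": (False, True, "none"),
    "BM-6": (False, True, "bmm"),
    "BM-7": (False, True, "bmm+ceb"),
    "BM-8": (True, False, "bmm"),
    "BM-9": (True, False, "bmm+ceb"),
    "BM-10": (False, False, "bmm"),
    "BM-11": (False, False, "bmm+ceb"),
}

# All 2 x 2 x 3 combinations of the three design choices.
full_grid = set(product([False, True], [False, True], ["none", "bmm", "bmm+ceb"]))
covered = set(ABLATION.values())
missing = full_grid - covered

print(f"{len(covered)} distinct configurations, {len(missing)} missing from the grid")
```

Since no combination is missing or duplicated, the contribution of each component can be read off by comparing pairs of rows that differ in exactly one choice, for example BM-3 against BM-4 for the effect of CEB.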
Network combination   IoU     F1      Precision   Recall
BM-0                  71.40   83.33   84.60       82.10
BM-1                  72.67   86.96   89.36       84.68
BM-2                  74.51   88.60   90.68       86.62
BM-3                  79.78   89.35   91.35       87.44
BM-4                  81.93   90.07   91.78       88.41
BM-5                  73.28   85.06   85.96       84.17
BM-6                  78.51   85.72   86.54       84.92
BM-7                  80.66   86.45   87.02       85.88
BM-8                  77.94   87.66   90.10       85.35
BM-9                  80.09   87.30   88.92       85.74
BM-10                 76.67   84.91   86.32       83.55
BM-11                 78.82   85.69   86.79       84.62
Tab.2  Results of ablation experiment(%)
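For readers unfamiliar with the four columns, the sketch below computes them from pixel counts, assuming the standard pixel-wise definitions for binary segmentation (this page does not state the definitions the paper used). The function names and the counts in `example` are made up for illustration. The last lines check one table row: the BM-0 F1 value equals the harmonic mean of its precision and recall.

```python
def segmentation_metrics(tp, fp, fn):
    """Standard binary-segmentation metrics from pixel counts.

    tp: building pixels predicted as building
    fp: background pixels predicted as building
    fn: building pixels predicted as background
    Assumes tp > 0 so that no denominator is zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


def f1_from(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not taken from the paper.
example = segmentation_metrics(tp=80, fp=10, fn=20)

# Consistency check on the BM-0 row of Tab.2 (values in percent).
bm0_f1 = round(f1_from(84.60, 82.10), 2)
print(bm0_f1)
```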
Fig.6  Figures of intermediate results of ablation experiment
Tab.3  Qualitative comparison experiment results of WHU Building Dataset
Method IoU/% F1/% Precision/% Recall/% Param/MB
VGG16-U-net 86.91 92.42 90.41 94.52 18.26
DeepLab V3+ 89.07 94.23 94.15 94.32 30.61
STT 90.48 94.97 94.85 95.09 21.02
ST-UNet 89.65 95.16 26.47
BMNet 91.73 95.14 95.01 95.27 21.34
Tab.4  Quantitative comparison experiment results of WHU Building Dataset
Tab.5  Qualitative comparison experiment results of Massachusetts Buildings Dataset
Method IoU/% F1/% Precision/% Recall/% Param/MB
VGG16-U-net 66.92 80.65 79.15 82.21 18.26
DeepLab V3+ 69.16 81.38 83.85 79.05 30.61
BANet 72.20 83.86 83.07 84.66 38.72
ST-UNet 71.26 86.49 26.47
BMNet 75.79 85.58 87.02 84.19 21.34
Tab.6  Quantitative comparison experiment results of Massachusetts Buildings Dataset
Tab.7  Qualitative comparison experiment results of Inria Aerial Image Labeling Dataset
Method IoU/% F1/% Precision/% Recall/% Param/MB
VGG16-U-net 70.92 83.12 85.07 81.26 18.26
DeepLab V3+ 74.52 83.47 81.82 85.19 30.61
SwinUperNet 79.37 88.19 87.43 88.96 65.84
ST-UNet 70.14 91.36 26.47
BMNet 81.93 90.06 91.78 88.41 21.34
Tab.8  Quantitative comparison experiment results of Inria Aerial Image Labeling Dataset
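The gains over U-Net quoted in the abstract can be recomputed from Tab.4, Tab.6, and Tab.8. The sketch below takes the Precision and IoU columns for VGG16-U-net and BMNet and reports the differences in percentage points. The first figure of each pair in the abstract matches the Precision column. The `RESULTS` layout and the `gains` helper are illustrative.

```python
# (Precision, IoU) in percent, transcribed from Tab.4, Tab.6 and Tab.8.
RESULTS = {
    "WHU Building": {
        "VGG16-U-net": (90.41, 86.91),
        "BMNet": (95.01, 91.73),
    },
    "Massachusetts Buildings": {
        "VGG16-U-net": (79.15, 66.92),
        "BMNet": (87.02, 75.79),
    },
    "Inria Aerial Image Labeling": {
        "VGG16-U-net": (85.07, 70.92),
        "BMNet": (91.78, 81.93),
    },
}


def gains(results):
    """Absolute gain of BMNet over VGG16-U-net, in percentage points."""
    out = {}
    for dataset, rows in results.items():
        base_p, base_iou = rows["VGG16-U-net"]
        ours_p, ours_iou = rows["BMNet"]
        out[dataset] = (round(ours_p - base_p, 1), round(ours_iou - base_iou, 1))
    return out


GAINS = gains(RESULTS)
for dataset, (d_precision, d_iou) in GAINS.items():
    print(f"{dataset}: precision +{d_precision}, IoU +{d_iou}")
```

These are absolute differences between percentages, not relative improvements.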
Fig.7  Comparison graph of parameter count and IoU data