SD-BASNet: a building extraction network for high-spatial-resolution remote sensing imagery
ZHU Juanjuan1,2, HUANG Liang1,3, ZHU Shasha4
1. Kunming University of Science and Technology, Faculty of Land Resource Engineering, Kunming 650093, China
2. Yunnan Institute of Surveying and Mapping of Geology and Mineral Resources Co., Ltd., Kunming 650218, China
3. Yunnan International Joint Laboratory for Integrated Sky-Ground Intelligent Monitoring of Mountain Hazards, Kunming 650093, China
4. Kunming General Survey of Natural Resources Center, China Geological Survey, Kunming 650100, China
To address the large parameter counts of existing networks and the loss of building detail during downsampling, this study draws on lightweight network design and proposes a building extraction network, SD-BASNet, that incorporates depthwise separable residual blocks and dilated convolution. First, a depthwise separable residual block was designed for the deeply supervised encoder-decoder prediction module. Depthwise separable convolution was incorporated into the ResNet backbone to avoid oversized convolutional kernels and reduce the number of network parameters. Second, to offset the potential loss of accuracy caused by making the network lightweight, dilated convolution was integrated into the encoder layers of the post-processing optimization module. This expands the receptive field of the feature maps, capturing broader contextual information and improving the accuracy of building feature extraction. Experiments on the WHU building dataset showed that the proposed network achieved an mIoU of 92.25%, an mPA of 96.59%, a recall of 96.50%, a precision of 93.79%, and an F1-score of 92.61%. Compared with the semantic segmentation networks PSPNet, SegNet, DeepLabV3, SE-UNet, and UNet++, SD-BASNet achieved higher accuracy and more complete building extraction. Compared with the baseline BASNet, SD-BASNet also reduced both parameter count and runtime, demonstrating its effectiveness.
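The two design choices in the abstract can be illustrated with simple arithmetic. The sketch below is not taken from the paper: the function names and the channel and kernel sizes are hypothetical, chosen only to show why a depthwise separable convolution has fewer weights than a standard convolution and how dilation enlarges the area a kernel covers without adding weights.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise convolution followed by a
    1 x 1 pointwise convolution (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel at a given dilation rate."""
    return k + (k - 1) * (dilation - 1)


# Illustrative values only; these are NOT the layer sizes used in SD-BASNet.
c_in, c_out, k = 64, 64, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(std, sep, round(std / sep, 2))   # 36864 4672 7.89
print(effective_kernel_size(3, 2))     # 5
```

With these assumed sizes, the separable layer needs roughly one eighth of the weights, and a 3 x 3 kernel with dilation rate 2 spans a 5 x 5 window while still holding nine weights. The actual savings in SD-BASNet depend on its layer configuration, which the abstract does not specify.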