Semantic segmentation of high-resolution remote sensing images based on context- and class-aware feature fusion

doi:10.6046/zrzyyg.2023312

Abstract

To address the loss of accuracy in semantic segmentation of remote sensing images caused by insufficient extraction of contextual dependencies and loss of spatial detail, this study proposes a semantic segmentation method based on context- and class-aware feature fusion. The method uses ResNet-50 as the backbone network for feature extraction and incorporates an attention module during downsampling to strengthen feature representation and the extraction of contextual dependencies. It places a large receptive field block on the skip connections to extract rich multiscale contextual information, thereby mitigating the impact of scale variation between targets. A scene feature association and fusion module is connected in parallel behind this block to guide the fusion of local features using global features. Finally, a class prediction module and a class-aware feature fusion module are built in the decoder to accurately fuse low-level detailed information with high-level semantic information. The method was validated on the Potsdam and Vaihingen datasets and compared with six commonly used methods, including DeepLabv3+ and BuildFormer. Experimental results show that the proposed method outperformed the other methods in recall, F1-score, and accuracy. In particular, it achieved intersection over union (IoU) values of 90.44% and 86.74% for building segmentation, improvements of 1.55% and 2.41%, respectively, over the second-best networks, DeepLabv3+ and A2FPN.
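
The metrics named in the abstract (IoU, recall, F1-score, accuracy) can be illustrated with a minimal sketch. It assumes the standard per-class definitions computed from a confusion matrix and treats "accuracy" as overall pixel accuracy; this page does not give the exact formulas used in the paper, and the function names and toy labels below are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_metrics(cm, cls):
    """Standard IoU, recall, precision and F1 for one class (assumed definitions)."""
    tp = cm[cls][cls]
    fn = sum(cm[cls]) - tp                   # true class `cls`, predicted otherwise
    fp = sum(row[cls] for row in cm) - tp    # predicted `cls`, true class otherwise
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"iou": iou, "recall": recall, "precision": precision, "f1": f1}


def overall_accuracy(cm):
    """Fraction of all pixels whose predicted class equals the true class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total if total else 0.0


# Toy example: six flattened pixel labels over three hypothetical classes.
y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2, 2]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
building = per_class_metrics(cm, cls=1)  # treat class 1 as "building"
print(building["iou"], overall_accuracy(cm))
```

In this toy case class 1 has two true positives, one false positive, and one false negative, so its IoU is 2 / (2 + 1 + 1) = 0.5. IoU penalizes both kinds of error in a single ratio, which is why it is commonly reported per class alongside recall and F1.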

Keywords: class-aware; semantic segmentation; remote sensing image; contextual information; feature fusion

CLC number (ZTFLH): TP751

Issue Date: 09 May 2025

Authors: Xiaojun HE, Jie LUO

Cite this article:

Xiaojun HE,Jie LUO. Semantic segmentation of high-resolution remote sensing images based on context- and class-aware feature fusion[J]. Remote Sensing for Natural Resources, 2025, 37(2): 1-10.

URL:

https://www.gtzyyg.com/EN/10.6046/zrzyyg.2023312 OR https://www.gtzyyg.com/EN/Y2025/V37/I2/1

Fig.1 CCFFSM network structure

Fig.2 DAM_CAM module

Fig.3 Large receptive field block

Fig.4 Scene-context feature fusion module

Fig.5 Category prediction module

Fig.6 Class-aware feature fusion module

Tab.1 Potsdam and Vaihingen datasets

Tab.2 Experimental results on the Potsdam dataset (%)

Fig.7 Partial visualization results of different methods on the Potsdam dataset

Tab.3 Experimental results on the Vaihingen dataset (%)

Fig.8 Partial visualization results of different methods on the Vaihingen dataset

Tab.4 IoU scores on the Potsdam dataset (%)

Tab.5 IoU scores on the Vaihingen dataset (%)

Fig.9 Global segmentation performance of CCFFSM on the Potsdam dataset

Fig.10 Global segmentation performance of CCFFSM on the Vaihingen dataset

Tab.6 Ablation experiment results of CCFFSM method (%)


