Semantic segmentation of high-resolution remote sensing images based on context- and class-aware feature fusion

doi:10.6046/zrzyyg.2023312




Abstract:

To address the loss of accuracy in the semantic segmentation of remote sensing images caused by insufficient extraction of contextual dependencies and loss of spatial detail, this study proposes a semantic segmentation method that combines context with class-aware feature fusion. The method uses ResNet-50 as the backbone network for feature extraction and applies an attention module during downsampling to strengthen feature representation and the extraction of contextual dependencies. It then builds a large receptive field block on the skip connections to extract rich multiscale contextual information and reduce the impact of scale variation between targets. A scene feature association and fusion module is connected in parallel after this block so that global features guide the fusion of local features. Finally, a class prediction module and a class-aware feature fusion module are built in the decoder to accurately fuse the high-level semantic information from the bottom (deep) layers with the detailed information from the top (shallow) layers. The method was validated on the Potsdam and Vaihingen datasets and compared with six commonly used methods, including DeepLabv3+ and BuildFormer. The experimental results show that the proposed method outperforms the other methods in recall, F1-score, and accuracy. In particular, its intersection over union (IoU) for building segmentation reaches 90.44% and 86.74% on the two datasets, improvements of 1.55% and 2.41% over the second-best networks, DeepLabv3+ and A²FPN, respectively.
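The abstract reports recall, F1-score, accuracy, and per-class IoU. The paper's own evaluation code is not shown on this page, so the following is only a minimal sketch of how such metrics are commonly derived from a confusion matrix. The function names (`confusion_matrix`, `per_class_metrics`, `overall_accuracy`) and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows index the reference class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def _safe_div(a: float, b: float) -> float:
    """Return a / b, or 0.0 when the denominator is zero."""
    return a / b if b else 0.0


def per_class_metrics(cm: List[List[int]], cls: int) -> Dict[str, float]:
    """Precision, recall, F1, and IoU for one class (hypothetical helper)."""
    tp = cm[cls][cls]
    fn = sum(cm[cls]) - tp                      # reference pixels of cls that were missed
    fp = sum(row[cls] for row in cm) - tp       # pixels wrongly predicted as cls
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    f1 = _safe_div(2 * precision * recall, precision + recall)
    iou = _safe_div(tp, tp + fp + fn)           # intersection over union
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


def overall_accuracy(cm: List[List[int]]) -> float:
    """Fraction of all pixels whose predicted class matches the reference."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return _safe_div(correct, total)


if __name__ == "__main__":
    # Toy flattened label maps with three classes (made-up values, not paper data).
    reference = [0, 0, 1, 1, 1, 2]
    predicted = [0, 1, 1, 1, 2, 2]
    cm = confusion_matrix(reference, predicted, num_classes=3)
    print(per_class_metrics(cm, cls=1))
    print(overall_accuracy(cm))
```

In practice the label maps would be the flattened ground-truth and predicted masks of a test tile, and a class such as buildings would be selected through `cls`.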

Keywords: class-aware, semantic segmentation, remote sensing image, contextual information, feature fusion

Received: 2023-10-14; Published: 2025-05-09

Chinese Library Classification (CLC) number: TP751

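Funding: Scientific Research Funding Project of the Education Department of Liaoning Province, "Research on a parallelized segmentation method for massive remote sensing images based on intelligent multi-agents" (LJKZ0350)

Corresponding author: LUO Jie (1995-), male, master's student; research interest: remote sensing image processing. Email: 1349876941@qq.com.

About the author: HE Xiaojun (1975-), male, PhD, associate professor; mainly engaged in research on remote sensing image processing, artificial intelligence, and big data processing. Email: hexiaojun@lntu.edu.cn.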

Cite this article:

HE Xiaojun, LUO Jie. Semantic segmentation of high-resolution remote sensing images based on context- and class-aware feature fusion[J]. Remote Sensing for Natural Resources, 2025, 37(2): 1-10.

Link to this article:

https://www.gtzyyg.com/CN/10.6046/zrzyyg.2023312 or https://www.gtzyyg.com/CN/Y2025/V37/I2/1

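Fig.1 Network structure of CCFFSM

Fig.2 DAM_CAM module

Fig.3 Large receptive field block

Fig.4 SCM module

Fig.5 CPM module

Fig.6 Class-aware feature fusion module

Tab.1 Potsdam and Vaihingen datasets

Tab.2 Experimental results on the Potsdam dataset

Fig.7 Selected visualization results of different methods on the Potsdam dataset

Tab.3 Experimental results on the Vaihingen dataset

Fig.8 Selected visualization results of different methods on the Vaihingen dataset

Tab.4 IoU scores on the Potsdam dataset

Tab.5 IoU scores on the Vaihingen dataset

Fig.9 Overall segmentation results of CCFFSM on the Potsdam dataset

Fig.10 Overall segmentation results of CCFFSM on the Vaihingen dataset

Tab.6 Ablation experiment results of the CCFFSM method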


