Remote Sensing for Natural Resources, 2025, Vol. 37, Issue (5): 152-161    DOI: 10.6046/zrzyyg.2024208
Technical Methods
End-to-end land cover classification based on panchromatic-multispectral dual-stream convolutional network
LI Yinglong1,2, DENG Yupeng1, KONG Yunlong1, CHEN Jingbo1, MENG Yu1, LIU Diyou1
1. National Engineering Research Center for Geoinformatics, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
2. School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China

Abstract

Multispectral (MS) and panchromatic (PAN) images serve as the primary data sources of visible-near-infrared optical remote sensing imagery. In a typical land cover classification workflow, the spatial resolution of MS images is first enhanced using pixel-level fusion methods, followed by image classification. However, pixel-level fusion is time-consuming and its optimization objective is not aligned with that of land cover classification, so it fails to meet the demand for end-to-end remote sensing image classification. To address these challenges, this paper proposes DSEUNet, a dual-stream fully convolutional neural network that obviates the need for pixel-level fusion. Specifically, two branches built on the EfficientNet-B3 network extract features from the PAN and MS images, respectively; the features are then fused at the feature level and decoded to produce the final classification result. Considering that PAN and MS images emphasize different characteristics of land cover elements, a spatial attention mechanism is incorporated into the PAN branch to enhance the perception of spatial information such as details and edges, and a channel attention mechanism is incorporated into the MS branch to improve the perception of reflectance differences across bands. Experiments on a 10 m land cover dataset and ablation studies of the network structure demonstrate that the proposed network achieves higher classification accuracy and faster inference. With the same backbone network, DSEUNet outperforms the traditional approach of classifying pixel-level fused images, improving mIoU by 1.62 percentage points, mFscore by 1.36 percentage points, and the Kappa coefficient by 1.49 percentage points, while increasing inference speed by 17.69%.
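The dual-stream design described above can be sketched in outline. The following NumPy toy (all names, shapes, and the random-projection "encoder" are hypothetical stand-ins, not the authors' EfficientNet-B3 implementation) illustrates the data flow: each branch encodes its own input into a feature map, and the two maps are combined at the feature level rather than by pixel-level pan-sharpening beforehand.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, out_ch):
    """Stand-in for one EfficientNet-B3 branch: downsample by 2 and apply
    a fixed random 1x1 projection. Purely illustrative."""
    c, h, w = x.shape
    pooled = x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))  # 2x downsample
    proj = rng.standard_normal((out_ch, c)) / np.sqrt(c)           # "1x1 conv"
    return np.tensordot(proj, pooled, axes=1)                      # (out_ch, H/2, W/2)

# PAN: 1 band; MS: 4 bands (placed on the same grid here for simplicity --
# the paper's two streams consume each source at its native resolution).
pan = rng.standard_normal((1, 32, 32))
ms = rng.standard_normal((4, 32, 32))

f_pan = encoder(pan, 16)   # spatial-detail features from the PAN stream
f_ms = encoder(ms, 16)     # spectral features from the MS stream

# Feature-level fusion by channel concatenation (one of several variants):
fused = np.concatenate([f_pan, f_ms], axis=0)
print(fused.shape)  # (32, 16, 16)
```

A decoder would then upsample `fused` back to the input grid to produce the per-pixel class map.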

Key words: land cover classification; deep learning; dual-stream network; panchromatic (PAN) image; multispectral (MS) image
Received: 2024-06-12      Published: 2025-10-28
CLC number: TP79
Funding: National Key Research and Development Program of China, "Open Remote Sensing Intelligent Interpretation Platform" (2021YFB3900504)
Corresponding author: CHEN Jingbo (1984-), male, associate researcher, mainly engaged in research on intelligent remote sensing analysis. Email: chenjb@aircas.ac.cn
About the first author: LI Yinglong (2000-), male, master's student, mainly engaged in research on intelligent interpretation of remote sensing images. Email: liyinglong22@mails.ucas.ac.cn
Cite this article:
LI Yinglong, DENG Yupeng, KONG Yunlong, CHEN Jingbo, MENG Yu, LIU Diyou. End-to-end land cover classification based on panchromatic-multispectral dual-stream convolutional network. Remote Sensing for Natural Resources, 2025, 37(5): 152-161.
Link to this article:
https://www.gtzyyg.com/CN/10.6046/zrzyyg.2024208      或      https://www.gtzyyg.com/CN/Y2025/V37/I5/152
Fig.1  Examples of MS images, PAN images, and labels in the dataset
Fig.2  Network architecture of DSEUNet
Fig.3  Structure of the panchromatic-multispectral fusion module
Fig.4  Structure of the spatial attention mechanism
Fig.5  Structure of the channel attention mechanism
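The two attention branches (Fig.4, Fig.5) follow a familiar pattern: channel attention reweights bands using globally pooled statistics, while spatial attention reweights pixel positions using cross-channel statistics. The sketch below is a deliberately simplified, hypothetical version of that idea (no learned MLP or convolution, unlike the paper's modules and CBAM), just to make the two axes of reweighting concrete.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x):
    """Reweight channels from global pooled statistics (SE-style idea).
    x: (C, H, W); returns the same shape. Simplified: no learned MLP."""
    avg = x.mean(axis=(1, 2))          # (C,) global average pool
    mx = x.max(axis=(1, 2))            # (C,) global max pool
    w = sigmoid(avg + mx)              # per-channel weight in (0, 1)
    return x * w[:, None, None]

def spatial_attention(x):
    """Reweight positions from cross-channel statistics (CBAM-style idea).
    x: (C, H, W); returns the same shape. Simplified: no learned conv."""
    avg = x.mean(axis=0)               # (H, W) channel-wise mean
    mx = x.max(axis=0)                 # (H, W) channel-wise max
    m = sigmoid(avg + mx)              # per-pixel weight in (0, 1)
    return x * m[None, :, :]

x = np.random.default_rng(1).standard_normal((8, 16, 16))
print(channel_attention(x).shape, spatial_attention(x).shape)
```

In the paper's arrangement, the spatial variant sits in the PAN branch (edges and detail) and the channel variant in the MS branch (per-band reflectance differences).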
(Per-class IoU followed by overall metrics, %)
Method  Cropland  Vegetation  Artificial surface  Bare land  Water  mIoU  OA  mFscore  Kappa
Classic semantic segmentation networks:
Deeplabv3  90.94  44.83  44.93  82.12  88.28  58.52  87.48  80.62  78.16
PspNet  90.77  44.57  46.58  82.55  87.76  58.70  87.57  80.86  78.36
SegNet  91.07  43.73  43.38  83.84  89.16  58.53  87.76  80.43  78.27
HRNet  91.34  45.94  50.74  83.29  88.88  60.03  88.44  82.15  79.72
EfficientUNet  92.10  51.17  55.39  81.93  89.81  61.73  89.59  83.92  81.74
Semantic segmentation networks for remote sensing:
UNetFormer  91.41  46.22  47.61  83.36  88.95  59.59  88.20  81.66  79.33
MACUNet  91.48  45.97  52.95  83.93  88.95  60.55  88.80  82.64  80.38
ABCNet  91.97  49.11  58.34  84.34  88.95  62.12  89.85  84.21  82.07
MANet  92.43  51.62  57.42  85.24  89.72  62.74  90.06  84.74  82.68
MAResUNet  92.03  49.69  57.37  86.62  89.62  62.55  89.96  84.50  82.29
DSEUNet (ours)  92.53  52.86  58.33  86.77  89.59  63.35  90.42  85.28  83.23
Tab.1  Classification metrics of different methods on the 10 m land cover dataset
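The accuracy columns in Tab.1 (IoU, mIoU, OA, mFscore, Kappa) are standard confusion-matrix statistics. A minimal NumPy sketch of how such numbers are typically derived (illustrative only, not the paper's evaluation code; the confusion matrix here is made up):

```python
import numpy as np

def metrics(cm):
    """Per-class IoU, mIoU, OA, mean F-score, and Cohen's kappa from a
    confusion matrix cm[i, j] = pixels of true class i predicted as j."""
    cm = cm.astype(float)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp           # predicted as class k but wrong
    fn = cm.sum(axis=1) - tp           # true class k but missed
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    oa = tp.sum() / cm.sum()           # overall accuracy
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / cm.sum() ** 2
    kappa = (oa - pe) / (1 - pe)       # chance-corrected agreement
    return iou, iou.mean(), oa, f1.mean(), kappa

cm = np.array([[50, 2, 3],
               [4, 40, 1],
               [2, 3, 45]])
iou, miou, oa, mf, kappa = metrics(cm)
```

Multiplying by 100 gives percentage values of the kind reported in Tab.1.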
Tab.2  Visualization of results of different methods on typical samples from the three test sets (columns: PAN image, MS image, label, DSEUNet, EfficientUNet, MANet, MAResUNet; rows: samples 1-3, with legend)
Method  GFLOPs  Params/10^6  FPS
Deeplabv3  198.16  10.36  12.94
PspNet  181.60  10.33  10.99
SegNet  1 452.12  29.45  8.72
HRNet  166.31  9.64  15.73
EfficientUNet  132.15  14.09  10.29
UNetFormer  11.77  107.40  17.55
MACUNet  268.06  5.15  10.20
ABCNet  144.19  13.67  9.22
MANet  403.75  35.86  10.32
MAResUNet  255.84  26.28  15.67
DSEUNet  105.81  34.67  12.11
Tab.3  Network efficiency metrics of different models
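FPS figures like those in Tab.3 are usually obtained by timing repeated forward passes after a warm-up. A generic, hypothetical timing helper (not the paper's benchmark code; a real GPU measurement would also need device synchronization and fixed input sizes):

```python
import time

def measure_fps(forward, n_warmup=2, n_runs=10):
    """Rough frames-per-second estimate for a forward function,
    where one call corresponds to processing one image."""
    for _ in range(n_warmup):          # warm-up runs are excluded
        forward()
    t0 = time.perf_counter()
    for _ in range(n_runs):
        forward()
    return n_runs / (time.perf_counter() - t0)

# Any callable works; a cheap CPU workload stands in for a network here.
fps = measure_fps(lambda: sum(i * i for i in range(10000)))
```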
Fusion method  mIoU  mFscore  Kappa
Adaptive fusion  63.35  85.28  83.23
Channel concatenation  62.61  84.72  82.71
Element-wise addition  62.57  84.67  82.70
Concatenation of sum and difference  62.55  84.61  82.67
Tab.4  Accuracy metrics of different fusion methods
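The four fusion variants compared in Tab.4 can be written down directly on two branch feature maps. In this NumPy sketch the "adaptive" gate is a fixed sigmoid-squashed vector standing in for learned per-channel weights, so it is hypothetical, not the paper's module:

```python
import numpy as np

rng = np.random.default_rng(2)
f_pan = rng.standard_normal((16, 8, 8))   # PAN-branch features
f_ms = rng.standard_normal((16, 8, 8))    # MS-branch features

# Channel concatenation: doubles the channel count.
concat = np.concatenate([f_pan, f_ms], axis=0)                   # (32, 8, 8)

# Element-wise addition: keeps the channel count.
added = f_pan + f_ms                                             # (16, 8, 8)

# Concatenation of sum and difference maps.
sum_diff = np.concatenate([f_pan + f_ms, f_pan - f_ms], axis=0)  # (32, 8, 8)

# "Adaptive" fusion sketch: a per-channel gate blends the two streams.
gate = 1.0 / (1.0 + np.exp(-rng.standard_normal(16)))            # in (0, 1)
adaptive = gate[:, None, None] * f_pan + (1 - gate)[:, None, None] * f_ms
```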
Attention  mIoU  mFscore  Kappa
None  62.34  84.34  82.08
KA (channel attention only)  62.52  84.50  82.14
KA (spatial attention only)  63.16  85.11  83.05
CBAM  62.74  84.80  82.75
KA  63.35  85.28  83.23
Tab.5  Effect of attention mechanisms on the accuracy metrics of DSEUNet
Tab.6  Changes in the network's attention to the cropland and vegetation classes before and after adding the KA attention mechanism (visualized attention maps for three samples; columns: PAN image, MS image, label, attention before/after, with legend)