Remote Sensing for Natural Resources, 2025, Vol. 37, Issue (5): 152-161    DOI: 10.6046/zrzyyg.2024208
Technical Methods
End-to-end land cover classification based on panchromatic-multispectral dual-stream convolutional network
LI Yinglong1,2, DENG Yupeng1, KONG Yunlong1, CHEN Jingbo1, MENG Yu1, LIU Diyou1
1. National Engineering Research Center for Geoinformatics, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
2. School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China

Abstract

Multispectral (MS) and panchromatic (PAN) images serve as the primary data sources of visible-near-infrared optical remote sensing imagery. In a typical land cover classification workflow, the spatial resolution of MS images is first enhanced using pixel-level fusion methods, followed by image classification. However, pixel-level fusion is time-consuming and its optimization objective is not aligned with that of land cover classification, so it fails to meet the demand for end-to-end remote sensing image classification. To address these challenges, this paper proposes DSEUNet, a dual-stream fully convolutional neural network that obviates the need for pixel-level fusion. Specifically, two branches built on the EfficientNet-B3 network extract features from the PAN and MS images, respectively; the features are then fused at the feature level and decoded to produce the final classification result. Considering that PAN and MS images emphasize different characteristics of land cover elements, a spatial attention mechanism is incorporated into the PAN branch to enhance the perception of spatial information such as details and edges, and a channel attention mechanism is incorporated into the MS branch to improve the perception of reflectance differences across bands. Experiments on a 10 m land cover dataset and ablation studies of the network structure demonstrate that the proposed network achieves higher classification accuracy and faster inference. With the same backbone network, DSEUNet outperforms the traditional approach of classifying pixel-level fused images, improving mIoU by 1.62 percentage points, mFscore by 1.36 percentage points, and the Kappa coefficient by 1.49 percentage points, while increasing inference speed by 17.69%.
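The dual-stream design described above can be sketched in outline. The following NumPy toy (all names, shapes, and the random-projection "encoder" are hypothetical stand-ins, not the authors' EfficientNet-B3 implementation) illustrates the data flow: each branch encodes its own input into a feature map, and the two maps are combined at the feature level rather than by pixel-level pan-sharpening beforehand.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, out_ch):
    """Stand-in for one EfficientNet-B3 branch: downsample by 2 and apply
    a fixed random 1x1 projection. Purely illustrative."""
    c, h, w = x.shape
    pooled = x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))  # 2x downsample
    proj = rng.standard_normal((out_ch, c)) / np.sqrt(c)           # "1x1 conv"
    return np.tensordot(proj, pooled, axes=1)                      # (out_ch, H/2, W/2)

# PAN: 1 band; MS: 4 bands (placed on the same grid here for simplicity --
# the paper's two streams consume each source at its native resolution).
pan = rng.standard_normal((1, 32, 32))
ms = rng.standard_normal((4, 32, 32))

f_pan = encoder(pan, 16)   # spatial-detail features from the PAN stream
f_ms = encoder(ms, 16)     # spectral features from the MS stream

# Feature-level fusion by channel concatenation (one of several variants):
fused = np.concatenate([f_pan, f_ms], axis=0)
print(fused.shape)  # (32, 16, 16)
```

A decoder would then upsample `fused` back to the input grid to produce the per-pixel class map.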

Key words: land cover classification; deep learning; dual-stream network; panchromatic (PAN) image; multispectral (MS) image
Received: 2024-06-12      Published: 2025-10-28
CLC number: TP79
Funding: National Key Research and Development Program of China, "Open Remote Sensing Intelligent Interpretation Platform" (2021YFB3900504)
Corresponding author: CHEN Jingbo (1984-), male, associate researcher, mainly engaged in research on intelligent remote sensing analysis. Email: chenjb@aircas.ac.cn
About the first author: LI Yinglong (2000-), male, master's student, mainly engaged in research on intelligent interpretation of remote sensing images. Email: liyinglong22@mails.ucas.ac.cn
Cite this article:
LI Yinglong, DENG Yupeng, KONG Yunlong, CHEN Jingbo, MENG Yu, LIU Diyou. End-to-end land cover classification based on panchromatic-multispectral dual-stream convolutional network. Remote Sensing for Natural Resources, 2025, 37(5): 152-161.
Link to this article:
https://www.gtzyyg.com/CN/10.6046/zrzyyg.2024208      或      https://www.gtzyyg.com/CN/Y2025/V37/I5/152
Fig.1  Examples of MS images, PAN images, and labels in the dataset
Fig.2  Network architecture of DSEUNet
Fig.3  Structure of the panchromatic-multispectral fusion module
Fig.4  Structure of the spatial attention mechanism
Fig.5  Structure of the channel attention mechanism
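The two attention branches (Fig.4, Fig.5) follow a familiar pattern: channel attention reweights bands using globally pooled statistics, while spatial attention reweights pixel positions using cross-channel statistics. The sketch below is a deliberately simplified, hypothetical version of that idea (no learned MLP or convolution, unlike the paper's modules and CBAM), just to make the two axes of reweighting concrete.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x):
    """Reweight channels from global pooled statistics (SE-style idea).
    x: (C, H, W); returns the same shape. Simplified: no learned MLP."""
    avg = x.mean(axis=(1, 2))          # (C,) global average pool
    mx = x.max(axis=(1, 2))            # (C,) global max pool
    w = sigmoid(avg + mx)              # per-channel weight in (0, 1)
    return x * w[:, None, None]

def spatial_attention(x):
    """Reweight positions from cross-channel statistics (CBAM-style idea).
    x: (C, H, W); returns the same shape. Simplified: no learned conv."""
    avg = x.mean(axis=0)               # (H, W) channel-wise mean
    mx = x.max(axis=0)                 # (H, W) channel-wise max
    m = sigmoid(avg + mx)              # per-pixel weight in (0, 1)
    return x * m[None, :, :]

x = np.random.default_rng(1).standard_normal((8, 16, 16))
print(channel_attention(x).shape, spatial_attention(x).shape)
```

In the paper's arrangement, the spatial variant sits in the PAN branch (edges and detail) and the channel variant in the MS branch (per-band reflectance differences).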
(Per-class IoU followed by overall metrics, %)
Method  Cropland  Vegetation  Artificial surface  Bare land  Water  mIoU  OA  mFscore  Kappa
Classic semantic segmentation networks:
Deeplabv3  90.94  44.83  44.93  82.12  88.28  58.52  87.48  80.62  78.16
PspNet  90.77  44.57  46.58  82.55  87.76  58.70  87.57  80.86  78.36
SegNet  91.07  43.73  43.38  83.84  89.16  58.53  87.76  80.43  78.27
HRNet  91.34  45.94  50.74  83.29  88.88  60.03  88.44  82.15  79.72
EfficientUNet  92.10  51.17  55.39  81.93  89.81  61.73  89.59  83.92  81.74
Semantic segmentation networks for remote sensing:
UNetFormer  91.41  46.22  47.61  83.36  88.95  59.59  88.20  81.66  79.33
MACUNet  91.48  45.97  52.95  83.93  88.95  60.55  88.80  82.64  80.38
ABCNet  91.97  49.11  58.34  84.34  88.95  62.12  89.85  84.21  82.07
MANet  92.43  51.62  57.42  85.24  89.72  62.74  90.06  84.74  82.68
MAResUNet  92.03  49.69  57.37  86.62  89.62  62.55  89.96  84.50  82.29
DSEUNet (ours)  92.53  52.86  58.33  86.77  89.59  63.35  90.42  85.28  83.23
Tab.1  Classification metrics of different methods on the 10 m land cover dataset
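The accuracy columns in Tab.1 (IoU, mIoU, OA, mFscore, Kappa) are standard confusion-matrix statistics. A minimal NumPy sketch of how such numbers are typically derived (illustrative only, not the paper's evaluation code; the confusion matrix here is made up):

```python
import numpy as np

def metrics(cm):
    """Per-class IoU, mIoU, OA, mean F-score, and Cohen's kappa from a
    confusion matrix cm[i, j] = pixels of true class i predicted as j."""
    cm = cm.astype(float)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp           # predicted as class k but wrong
    fn = cm.sum(axis=1) - tp           # true class k but missed
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    oa = tp.sum() / cm.sum()           # overall accuracy
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / cm.sum() ** 2
    kappa = (oa - pe) / (1 - pe)       # chance-corrected agreement
    return iou, iou.mean(), oa, f1.mean(), kappa

cm = np.array([[50, 2, 3],
               [4, 40, 1],
               [2, 3, 45]])
iou, miou, oa, mf, kappa = metrics(cm)
```

Multiplying by 100 gives percentage values of the kind reported in Tab.1.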
Tab.2  Visualization of results of different methods on typical samples from the three test sets (columns: PAN image, MS image, label, DSEUNet, EfficientUNet, MANet, MAResUNet; rows: samples 1-3, with legend)
Method  GFLOPs  Params/10^6  FPS
Deeplabv3  198.16  10.36  12.94
PspNet  181.60  10.33  10.99
SegNet  1 452.12  29.45  8.72
HRNet  166.31  9.64  15.73
EfficientUNet  132.15  14.09  10.29
UNetFormer  11.77  107.40  17.55
MACUNet  268.06  5.15  10.20
ABCNet  144.19  13.67  9.22
MANet  403.75  35.86  10.32
MAResUNet  255.84  26.28  15.67
DSEUNet  105.81  34.67  12.11
Tab.3  Network efficiency metrics of different models
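FPS figures like those in Tab.3 are usually obtained by timing repeated forward passes after a warm-up. A generic, hypothetical timing helper (not the paper's benchmark code; a real GPU measurement would also need device synchronization and fixed input sizes):

```python
import time

def measure_fps(forward, n_warmup=2, n_runs=10):
    """Rough frames-per-second estimate for a forward function,
    where one call corresponds to processing one image."""
    for _ in range(n_warmup):          # warm-up runs are excluded
        forward()
    t0 = time.perf_counter()
    for _ in range(n_runs):
        forward()
    return n_runs / (time.perf_counter() - t0)

# Any callable works; a cheap CPU workload stands in for a network here.
fps = measure_fps(lambda: sum(i * i for i in range(10000)))
```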
Fusion method  mIoU  mFscore  Kappa
Adaptive fusion  63.35  85.28  83.23
Channel concatenation  62.61  84.72  82.71
Element-wise addition  62.57  84.67  82.70
Concatenation of sum and difference  62.55  84.61  82.67
Tab.4  Accuracy metrics of different fusion methods
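The four fusion variants compared in Tab.4 can be written down directly on two branch feature maps. In this NumPy sketch the "adaptive" gate is a fixed sigmoid-squashed vector standing in for learned per-channel weights, so it is hypothetical, not the paper's module:

```python
import numpy as np

rng = np.random.default_rng(2)
f_pan = rng.standard_normal((16, 8, 8))   # PAN-branch features
f_ms = rng.standard_normal((16, 8, 8))    # MS-branch features

# Channel concatenation: doubles the channel count.
concat = np.concatenate([f_pan, f_ms], axis=0)                   # (32, 8, 8)

# Element-wise addition: keeps the channel count.
added = f_pan + f_ms                                             # (16, 8, 8)

# Concatenation of sum and difference maps.
sum_diff = np.concatenate([f_pan + f_ms, f_pan - f_ms], axis=0)  # (32, 8, 8)

# "Adaptive" fusion sketch: a per-channel gate blends the two streams.
gate = 1.0 / (1.0 + np.exp(-rng.standard_normal(16)))            # in (0, 1)
adaptive = gate[:, None, None] * f_pan + (1 - gate)[:, None, None] * f_ms
```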
Attention  mIoU  mFscore  Kappa
None  62.34  84.34  82.08
KA (channel attention only)  62.52  84.50  82.14
KA (spatial attention only)  63.16  85.11  83.05
CBAM  62.74  84.80  82.75
KA  63.35  85.28  83.23
Tab.5  Effect of attention mechanisms on the accuracy metrics of DSEUNet
Tab.6  Changes in the network's attention to the cropland and vegetation classes before and after adding the KA attention mechanism (visualized attention maps for three samples; columns: PAN image, MS image, label, attention before/after, with legend)