Remote Sensing for Natural Resources    2025, Vol. 37 Issue (5) : 152-161     DOI: 10.6046/zrzyyg.2024208
End-to-end land cover classification based on panchromatic-multispectral dual-stream convolutional network
LI Yinglong1,2, DENG Yupeng1, KONG Yunlong1, CHEN Jingbo1, MENG Yu1, LIU Diyou1
1. National Engineering Research Center for Geoinformatics, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
2. School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
Abstract  

Multispectral (MS) and panchromatic (PAN) images are the primary data sources for visible-near-infrared optical remote sensing. In a typical land cover classification workflow, the spatial resolution of MS images is first enhanced through pixel-level fusion with PAN images, and the fused image is then classified. However, pixel-level fusion is time-consuming, and its optimization objective is inconsistent with that of land cover classification, so this workflow cannot meet the demand for end-to-end remote sensing image classification. To address these challenges, this paper proposes a dual-stream fully convolutional neural network, DSEUNet, which obviates the need for pixel-level fusion. Specifically, two branches built on the EfficientNet-B3 network extract features from the PAN and MS images, respectively; the extracted features are then fused at the feature level and decoded to produce the final classification result. Because PAN and MS images capture different characteristics of land cover, a spatial attention mechanism is incorporated into the PAN branch to enhance the perception of spatial information such as details and edges, and a channel attention mechanism is incorporated into the MS branch to improve the perception of reflectance differences across bands. Experiments on a 10 m land cover dataset and ablation studies of the network structure demonstrate that the proposed network achieves higher classification accuracy and faster inference. With the same backbone network, DSEUNet outperforms the traditional pixel-level-fusion-then-classification approach by 1.62 percentage points in mIoU, 1.36 percentage points in mFscore, and 1.49 percentage points in Kappa coefficient, with a 17.69% improvement in inference speed.
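For readers who want a concrete picture of the architecture described above, the following is a minimal PyTorch sketch of the dual-stream idea: two encoders (EfficientNet-B3 in the paper; simple convolution stacks stand in here), CBAM-style spatial attention on the PAN branch, SE-style channel attention on the MS branch, feature-level fusion, and a lightweight decoder. All module names and sizes are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a PAN/MS dual-stream segmentation network.
# Encoders are simple conv stacks standing in for EfficientNet-B3.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):        # emphasizes details/edges (PAN branch)
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)
    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)                 # B,1,H,W
        mx, _ = x.max(dim=1, keepdim=True)                # B,1,H,W
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn

class ChannelAttention(nn.Module):        # emphasizes band reflectance (MS branch)
    def __init__(self, ch, r=8):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(ch, ch // r), nn.ReLU(),
                                 nn.Linear(ch // r, ch))
    def forward(self, x):
        w = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))))   # B,C squeeze-excite
        return x * w[:, :, None, None]

class DualStreamSeg(nn.Module):
    def __init__(self, ms_bands=4, n_classes=5, ch=64):
        super().__init__()
        enc = lambda cin: nn.Sequential(                  # stand-in encoder
            nn.Conv2d(cin, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU())
        self.pan_enc, self.ms_enc = enc(1), enc(ms_bands)
        self.pan_att, self.ms_att = SpatialAttention(), ChannelAttention(ch)
        self.fuse = nn.Conv2d(2 * ch, ch, kernel_size=1)  # feature-level fusion
        self.head = nn.Sequential(                        # lightweight decoder
            nn.Upsample(scale_factor=4, mode='bilinear', align_corners=False),
            nn.Conv2d(ch, n_classes, kernel_size=1))
    def forward(self, pan, ms):
        # PAN is 4x the MS resolution here; encode each at its native scale,
        # then upsample the MS features to match before fusing.
        fp = self.pan_att(self.pan_enc(pan))
        fm = self.ms_att(self.ms_enc(ms))
        fm = nn.functional.interpolate(fm, size=fp.shape[-2:],
                                       mode='bilinear', align_corners=False)
        return self.head(self.fuse(torch.cat([fp, fm], dim=1)))

# logits = DualStreamSeg()(torch.rand(1, 1, 256, 256), torch.rand(1, 4, 64, 64))
```

A forward pass with a 256x256 PAN patch and the corresponding 64x64 four-band MS patch yields per-pixel class logits at the PAN resolution.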

Keywords: land cover classification; deep learning; dual-stream network; panchromatic (PAN) image; multispectral (MS) image
CLC number: TP79
Issue Date: 28 October 2025
Cite this article:   
Yinglong LI, Yupeng DENG, Yunlong KONG, et al. End-to-end land cover classification based on panchromatic-multispectral dual-stream convolutional network[J]. Remote Sensing for Natural Resources, 2025, 37(5): 152-161.
URL:  
https://www.gtzyyg.com/EN/10.6046/zrzyyg.2024208     OR     https://www.gtzyyg.com/EN/Y2025/V37/I5/152
Fig.1  Examples of multispectral, panchromatic and label images in the dataset
Fig.2  Network structure of DSEUNet
Fig.3  The structure of the fusion module
Fig.4  Structure of spatial attention mechanism
Fig.5  Structure of channel attention mechanism
(First five columns: per-class IoU for cropland, vegetation, artificial surface, bare land, and water.)

Method               Cropland   Vegetation   Artificial   Bare land   Water    mIoU    OA      mFscore   Kappa

Classic semantic segmentation networks
Deeplabv3            90.94      44.83        44.93        82.12       88.28    58.52   87.48   80.62     78.16
PspNet               90.77      44.57        46.58        82.55       87.76    58.70   87.57   80.86     78.36
SegNet               91.07      43.73        43.38        83.84       89.16    58.53   87.76   80.43     78.27
HRNet                91.34      45.94        50.74        83.29       88.88    60.03   88.44   82.15     79.72
EfficientUNet        92.10      51.17        55.39        81.93       89.81    61.73   89.59   83.92     81.74

Semantic segmentation networks for remote sensing
UNetFormer           91.41      46.22        47.61        83.36       88.95    59.59   88.20   81.66     79.33
MACUNet              91.48      45.97        52.95        83.93       88.95    60.55   88.80   82.64     80.38
ABCNet               91.97      49.11        58.34        84.34       88.95    62.12   89.85   84.21     82.07
MANet                92.43      51.62        57.42        85.24       89.72    62.74   90.06   84.74     82.68
MAResUNet            92.03      49.69        57.37        86.62       89.62    62.55   89.96   84.50     82.29
DSEUNet (proposed)   92.53      52.86        58.33        86.77       89.59    63.35   90.42   85.28     83.23
Tab.1  Classification metrics of different methods on the 10 m land cover dataset (%)
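The metrics reported in Tab.1 all derive from the test-set confusion matrix. The following NumPy sketch shows the standard definitions of per-class IoU, mIoU, OA, mean F-score, and the Kappa coefficient; it mirrors textbook definitions and is not the authors' evaluation code.

```python
# Standard segmentation metrics from a confusion matrix.
import numpy as np

def metrics(cm):                       # cm[i, j]: true class i predicted as j
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    iou = tp / (tp + fp + fn)                          # per-class IoU
    f1 = 2 * tp / (2 * tp + fp + fn)                   # per-class F-score
    oa = tp.sum() / cm.sum()                           # overall accuracy
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / cm.sum() ** 2
    kappa = (oa - pe) / (1 - pe)                       # chance-corrected OA
    return iou, iou.mean(), oa, f1.mean(), kappa

# Example with a toy 5-class confusion matrix (classes as in Tab.1):
cm = np.random.default_rng(0).integers(0, 100, size=(5, 5))
print(metrics(cm))
```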
[Image grid: for three typical test samples, the PAN image, MS image, ground-truth label, and predictions of DSEUNet, EfficientUNet, MANet, and MAResUNet, with legend]
Tab.2  Visualization of results on typical test set images using different methods
Method          GFLOPS     Params/10^6   FPS
Deeplabv3       198.16     10.36         12.94
PspNet          181.60     10.33         10.99
SegNet          1 452.12   29.45         8.72
HRNet           166.31     9.64          15.73
EfficientUNet   132.15     14.09         10.29
UNetFormer      11.77      107.40        17.55
MACUNet         268.06     5.15          10.20
ABCNet          144.19     13.67         9.22
MANet           403.75     35.86         10.32
MAResUNet       255.84     26.28         15.67
DSEUNet         105.81     34.67         12.11
Tab.3  Network efficiency indicators of different models
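Figures like those in Tab.3 are typically measured as follows: parameter count read directly from the model, and FPS from timed forward passes (GFLOPs normally come from a profiler such as fvcore or thop and are omitted here). Input sizes, batch size, warm-up, and hardware all affect the numbers, so treat this sketch as illustrative rather than the authors' benchmarking code.

```python
# Parameter count and FPS measurement for a two-input model.
import time
import torch

def efficiency(model, pan_size=(1, 1, 256, 256), ms_size=(1, 4, 64, 64), n=50):
    params_m = sum(p.numel() for p in model.parameters()) / 1e6  # params /10^6
    pan, ms = torch.rand(pan_size), torch.rand(ms_size)
    model.eval()
    with torch.no_grad():
        for _ in range(5):                     # warm-up iterations
            model(pan, ms)
        t0 = time.perf_counter()
        for _ in range(n):
            model(pan, ms)
        fps = n / (time.perf_counter() - t0)   # frames per second
    return params_m, fps

# print(efficiency(DualStreamSeg()))  # using the sketch model defined earlier
```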
Fusion method                          mIoU    mFscore   Kappa
Adaptive fusion                        63.35   85.28     83.23
Channel concatenation                  62.61   84.72     82.71
Element-wise addition                  62.57   84.67     82.70
Concatenation of sum and difference    62.55   84.61     82.67
Tab.4  Accuracy metrics of different fusion methods (%)
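The four fusion variants compared in Tab.4 can be written compactly as below. The "adaptive fusion" here is sketched as a learned per-pixel gate; the paper's actual fusion module (Fig.3) may differ in detail.

```python
# The four feature-fusion variants from Tab.4, as one illustrative module.
import torch
import torch.nn as nn

class Fusion(nn.Module):
    def __init__(self, ch, mode='adaptive'):
        super().__init__()
        self.mode = mode
        self.gate = nn.Conv2d(2 * ch, ch, kernel_size=1)  # 'adaptive' only
    def forward(self, f_pan, f_ms):
        if self.mode == 'adaptive':            # learned per-pixel weighting
            w = torch.sigmoid(self.gate(torch.cat([f_pan, f_ms], dim=1)))
            return w * f_pan + (1 - w) * f_ms
        if self.mode == 'concat':              # channel concatenation
            return torch.cat([f_pan, f_ms], dim=1)
        if self.mode == 'add':                 # element-wise addition
            return f_pan + f_ms
        # concatenation of sum and difference; note that this mode and
        # 'concat' double the channel count seen by the decoder
        return torch.cat([f_pan + f_ms, f_pan - f_ms], dim=1)

# f = Fusion(64)(torch.rand(1, 64, 64, 64), torch.rand(1, 64, 64, 64))
```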
Attention                        mIoU    mFscore   Kappa
None                             62.34   84.34     82.08
KA (channel attention only)      62.52   84.50     82.14
KA (spatial attention only)      63.16   85.11     83.05
CBAM                             62.74   84.80     82.75
KA                               63.35   85.28     83.23
Tab.5  Influence of the attention mechanism on the accuracy metrics of DSEUNet (%)
[Image grid: for typical samples, the PAN image, MS image, label, and the network's attention maps for the cropland and vegetation classes, before and after adding the attention mechanism, with legend]
Tab.6  Changes in network attention to different land cover types before and after adding KA
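Class-attention visualizations like those in Tab.6 are commonly produced with Grad-CAM [33]: gradients of the target-class score are pooled over a chosen feature map and used to weight its channels. The sketch below adapts this to a two-input segmentation model; the layer choice and normalization are illustrative assumptions.

```python
# Grad-CAM sketch for a dual-input segmentation network.
import torch

def grad_cam(model, layer, pan, ms, target_class):
    feats = {}
    h = layer.register_forward_hook(lambda m, i, o: feats.update(f=o))
    logits = model(pan, ms)                        # B, n_classes, H, W
    h.remove()
    score = logits[:, target_class].sum()          # aggregate class score
    grads, = torch.autograd.grad(score, feats['f'])
    weights = grads.mean(dim=(2, 3), keepdim=True)        # GAP of gradients
    cam = torch.relu((weights * feats['f']).sum(dim=1))   # B, h, w
    return cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-8)

# m = DualStreamSeg()  # the sketch model defined earlier
# cam = grad_cam(m, m.fuse, torch.rand(1, 1, 256, 256),
#                torch.rand(1, 4, 64, 64), target_class=0)
```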
[1] Munechika C K, Warnick J S, Salvaggio C, et al. Resolution enhancement of multispectral image data to improve classification accuracy[J]. Photogrammetric Engineering and Remote Sensing, 1993, 59:67-72.
[2] Jia X, Richards J A. Cluster-space representation for hyperspectral data classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2002, 40(3):593-598.
[3] Li S, Kang X, Fang L, et al. Pixel-level image fusion:A survey of the state of the art[J]. Information Fusion, 2017, 33:100-112.
[4] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6):84-90.
[5] Li X, Xu F, Lyu X, et al. A remote-sensing image pan-sharpening method based on multi-scale channel attention residual network[J]. IEEE Access, 2020, 8:27163-27177.
[6] Zhang W, Li J, Hua Z. Attention-based tri-UNet for remote sensing image pan-sharpening[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021, 14:3719-3732.
[7] Ghamisi P, Rasti B, Yokoya N, et al. Multisource and multitemporal data fusion in remote sensing:A comprehensive review of the state of the art[J]. IEEE Geoscience and Remote Sensing Magazine, 2019, 7(1):6-39.
[8] Moser G, De Giorgi A, Serpico S B. Multiresolution supervised classification of panchromatic and multispectral images by Markov random fields and graph cuts[J]. IEEE Transactions on Geoscience and Remote Sensing, 2016, 54(9):5054-5070.
[9] Mao T, Tang H, Wu J, et al. A generalized metaphor of Chinese restaurant franchise to fusing both panchromatic and multispectral images for unsupervised classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2016, 54(8):4594-4604.
[10] Zhang L, Zhang L, Du B. Deep learning for remote sensing data:A technical tutorial on the state of the art[J]. IEEE Geoscience and Remote Sensing Magazine, 2016, 4(2):22-40.
[11] Liu X, Jiao L, Zhao J, et al. Deep multiple instance learning-based spatial-spectral classification for PAN and MS imagery[J]. IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(1):461-473.
[12] Gaetano R, Ienco D, Ose K, et al. MRFusion:A Deep Learning architecture to fuse PAN and MS imagery for land cover mapping[J/OL]. arXiv, 2018(2018-06-29) [2024-11-03]. https://arxiv.org/abs/1806.11452v1.
[13] Zhu H, Ma W, Li L, et al. A dual-branch attention fusion deep network for multiresolution remote-sensing image classification[J]. Information Fusion, 2020, 58:116-131.
[14] Zhu H, Ma M, Ma W, et al. A spatial-channel progressive fusion ResNet for remote sensing classification[J]. Information Fusion, 2021, 70:72-87.
[15] Shelhamer E, Long J, Darrell T. Fully convolutional networks for semantic segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4):640-651.
[16] Ronneberger O, Fischer P, Brox T. U-net:Convolutional networks for biomedical image segmentation[C]// Proceedings of the Medical Image Computing and Computer-Assisted Intervention-MICCAI 2015: 18th International Conference.Springer,2015:234-241.
[17] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2018:7132-7141.
[18] Woo S, Park J, Lee J Y, et al. CBAM:Convolutional block attention module[C]// Proceedings of the European Conference on Computer Vision. Springer, 2018:3-19.
[19] Fu J, Liu J, Tian H, et al. Dual attention network for scene segmentation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2019:3141-3149.
[20] National Geomatics Center of China. CH/T 9032—2022 Global geographic information resource data product specification[S]. Beijing: China Cartographic Publishing House, 2022.
[21] Tan M, Le Q V. EfficientNet:Rethinking model scaling for convolutional neural networks[J/OL]. arXiv, 2019(2019-05-28). https://arxiv.org/abs/1905.11946v5.
[22] Howard A, Sandler M, Chen B, et al. Searching for MobileNetV3[C]// 2019 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2019:1314-1324.
[23] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2016:770-778.
[24] Chen L C, Papandreou G, Kokkinos I, et al. DeepLab:Semantic image segmentation with deep convolutional nets,atrous convolution,and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4):834-848.
[25] Zhao H, Shi J, Qi X, et al. Pyramid scene parsing network[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017:6230-6239.
[26] Badrinarayanan V, Kendall A, Cipolla R. SegNet:A deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12):2481-2495.
[27] Wang J, Sun K, Cheng T, et al. Deep high-resolution representation learning for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(10):3349-3364.
[28] Wang L, Li R, Zhang C, et al. UNetFormer:A UNet-like transformer for efficient semantic segmentation of remote sensing urban scene imagery[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2022, 190:196-214.
[29] Li R, Duan C, Zheng S, et al. MACU-net for semantic segmentation of fine-resolution remotely sensed images[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19:8007205.
[30] Li R, Zheng S, Zhang C, et al. ABCNet:Attentive bilateral contextual network for efficient semantic segmentation of fine-resolution remotely sensed imagery[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2021, 181:84-98.
[31] Li R, Zheng S, Zhang C, et al. Multiattention network for semantic segmentation of fine-resolution remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 60:5607713.
[32] Li R, Zheng S, Duan C, et al. Multistage attention ResU-net for semantic segmentation of fine-resolution remote sensing images[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19:8009205.
[33] Selvaraju R R, Cogswell M, Das A, et al. Grad-CAM:Visual explanations from deep networks via gradient-based localization[C]//2017 IEEE International Conference on Computer Vision (ICCV). IEEE,2017:618-626.
[34] Sun X, Tian Y, Lu W, et al. From single- to multi-modal remote sensing imagery interpretation:A survey and taxonomy[J]. Science China Information Sciences, 2023, 66(4):140301.