End-to-end land cover classification based on panchromatic-multispectral dual-stream convolutional network

doi:10.6046/zrzyyg.2024208


Abstract

Multispectral (MS) and panchromatic (PAN) images are the primary data sources for visible-near-infrared optical remote sensing. In a typical land cover classification workflow, the spatial resolution of the MS image is first enhanced by pixel-level fusion, and the fused image is then classified. However, pixel-level fusion is time-consuming and is not optimized toward the objective of land cover classification, so it cannot meet the demand for end-to-end remote sensing image classification. To address these challenges, this paper proposes DSEUNet, a dual-stream fully convolutional neural network that requires no pixel-level fusion. Specifically, two branches built on the EfficientNet-B3 network extract features from the PAN and MS images, respectively. The extracted features are then fused at the feature level and decoded to output the final classification result. Because PAN and MS images emphasize different characteristics of land cover elements, a spatial attention mechanism is incorporated in the PAN branch to enhance the perception of spatial information such as details and edges, and a channel attention mechanism is incorporated in the MS branch to improve the perception of reflectance differences across bands. Experiments on the 10 m land cover dataset and ablation studies of the network structure show that the proposed network achieves higher classification accuracy and faster inference. With the same backbone network, DSEUNet outperforms the traditional pixel-level fusion-based classification method by 1.62 percentage points in mIoU, 1.36 percentage points in mFscore, and 1.49 percentage points in Kappa coefficient, with a 17.69% improvement in inference speed.
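The abstract reports mIoU, mFscore, and the Kappa coefficient without stating their formulas. The following is a minimal sketch of how these three indices are commonly derived from a confusion matrix, assuming the standard definitions (per-class IoU and F1 averaged over classes, and Cohen's Kappa); whether the paper averages in exactly this way is an assumption. The function names `confusion_matrix` and `segmentation_metrics` are hypothetical and introduced here for illustration only.

```python
from typing import Dict, List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Count pixels; rows index the reference class, columns the predicted class."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Return mIoU, mean F-score, and Cohen's Kappa as fractions in [0, 1]."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[k]) for k in range(n)]                       # reference totals
    col_sums = [sum(cm[i][k] for i in range(n)) for k in range(n)]  # predicted totals

    ious, fscores = [], []
    for k in range(n):
        tp = cm[k][k]
        fp = col_sums[k] - tp
        fn = row_sums[k] - tp
        if tp + fp + fn == 0:
            continue  # class absent from both reference and prediction (assumed skipped)
        ious.append(tp / (tp + fp + fn))
        fscores.append(2 * tp / (2 * tp + fp + fn))

    p_observed = sum(cm[k][k] for k in range(n)) / total
    p_expected = sum(row_sums[k] * col_sums[k] for k in range(n)) / (total * total)
    kappa = (p_observed - p_expected) / (1 - p_expected) if p_expected < 1 else 0.0

    return {
        "mIoU": sum(ious) / len(ious),
        "mFscore": sum(fscores) / len(fscores),
        "Kappa": kappa,
    }


# Toy example with three classes and six labelled pixels.
cm = confusion_matrix([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], n_classes=3)
metrics = segmentation_metrics(cm)
```

Multiplying the returned fractions by 100 gives percentages. The accuracy gains quoted in the abstract are differences between two such percentages (percentage points), whereas the 17.69% figure for inference speed is a relative change.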

Keywords: land cover classification; deep learning; dual-stream network; panchromatic (PAN) image; multispectral (MS) image

CLC number (ZTFLH): TP79

Issue Date: 28 October 2025

Authors:

	Yinglong LI
	Yupeng DENG
	Yunlong KONG
	Jingbo CHEN
	Yu MENG
	Diyou LIU

Cite this article:

Yinglong LI, Yupeng DENG, Yunlong KONG, et al. End-to-end land cover classification based on panchromatic-multispectral dual-stream convolutional network[J]. Remote Sensing for Natural Resources, 2025, 37(5): 152-161.

URL:

https://www.gtzyyg.com/EN/10.6046/zrzyyg.2024208 OR https://www.gtzyyg.com/EN/Y2025/V37/I5/152

Fig.1 Examples of multispectral, panchromatic, and label images in the dataset

Fig.2 Network structure of DSEUNet

Fig.3 The structure of the fusion module

Fig.4 Structure of spatial attention mechanism

Fig.5 Structure of channel attention mechanism

Tab.1 Classification indices of different methods on the 10 m land cover dataset (%)

Tab.2 Visualization of results on typical test set images using different methods

Tab.3 Network efficiency indicators of different models

Tab.4 Accuracy indices using different fusion methods (%)

Tab.5 Influence of the attention mechanisms on the accuracy indices of DSEUNet (%)

Tab.6 Changes of network attention to different land cover types before and after the addition of KA


