Self-learning segmentation of high-resolution remote sensing images based on visual dual-drive cognition
WU Zhijun1, CONG Ming1, XU Miaozhong2, HAN Ling1, CUI Jianjun1, ZHAO Chaoying1, XI Jiangbo1, YANG Chengsheng1, DING Mingtao1, REN Chaofeng1, GU Junkai1, PENG Xiaodong1, TAO Yiting2
1. College of Geological Engineering and Geomatics, Chang’an University, Xi’an 710064, China
2. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430072, China
High-resolution remote sensing images contain complex scenes that are difficult to analyze. Moreover, because the scenes are so diverse, sample databases rarely provide accurate reference data. This paper therefore proposes a self-learning segmentation method for high-resolution remote sensing images, modeled on the visual dual-drive cognition mechanism. Following the principle of visual perception, the method first interprets the typical ground objects in a scene through unsupervised adaptive analysis. It then integrates a neural network to achieve self-learning identification of those typical ground objects. Finally, it self-checks and corrects the segmentation results by combining the unsupervised analysis with the neural network learning. Comparative experiments on real high-resolution remote sensing images containing complex ground scenes were conducted against two popular deep neural network segmentation methods: mask region-based convolutional neural network (Mask R-CNN) and scalable vision transformer (ScalableViT). The results show that the proposed method maintains robust and reliable segmentation accuracy and outperforms both baselines in ground object cognition, generalization performance, and anti-interference ability, making it a cost-effective and practical approach.
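The three stages named in the abstract (unsupervised analysis, self-learning on its output, then a self-check that combines the two) can be illustrated with a toy sketch. This is not the authors' implementation: k-means stands in for the unsupervised adaptive analysis, softmax regression stands in for the neural network, and the confidence-based correction rule, the function names, and the parameters `k` and `conf_threshold` are all assumptions made for illustration.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def kmeans(pixels, k, iters=20):
    """Plain k-means: an assumed stand-in for the unsupervised analysis."""
    # Deterministic init: pick pixels at evenly spaced brightness ranks.
    order = np.argsort(pixels.mean(axis=1))
    picks = order[np.linspace(0, len(pixels) - 1, k).astype(int)]
    centers = pixels[picks].copy()
    for _ in range(iters):
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels


def train_softmax(x, y, k, epochs=200, lr=0.5):
    """Softmax regression: an assumed stand-in for the neural network."""
    w = np.zeros((x.shape[1], k))
    b = np.zeros(k)
    onehot = np.eye(k)[y]
    for _ in range(epochs):
        grad = (softmax(x @ w + b) - onehot) / len(x)
        w -= lr * (x.T @ grad)
        b -= lr * grad.sum(axis=0)
    return w, b


def self_learning_segment(image, k=2, conf_threshold=0.9):
    """Return (label map, fraction of pixels where both stages agree)."""
    h, width, c = image.shape
    pixels = image.reshape(-1, c).astype(float)
    # Stage 1: unsupervised analysis yields pseudo-labels (no sample database).
    cluster_labels = kmeans(pixels, k)
    # Stage 2: the learner is trained on those pseudo-labels (self-learning).
    feats = (pixels - pixels.mean(axis=0)) / (pixels.std(axis=0) + 1e-8)
    w, b = train_softmax(feats, cluster_labels, k)
    probs = softmax(feats @ w + b)
    net_labels = probs.argmax(axis=1)
    # Stage 3: self-check. Where the two stages disagree, take the learner's
    # label only if it is confident (hypothetical rule, not from the paper).
    agree = cluster_labels == net_labels
    override = ~agree & (probs.max(axis=1) >= conf_threshold)
    final = np.where(override, net_labels, cluster_labels)
    return final.reshape(h, width), agree.mean()
```

Calling `self_learning_segment(img, k=2)` on an `H x W x C` array returns a 2-D label map together with the agreement rate between the two stages. The sketch works on raw per-pixel values only and builds a full pixel-to-center distance array, so it is suitable for small test images rather than full scenes.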