Abstract:
Cloud detection is a critical research direction in remote sensing image processing, with broad applications in meteorological monitoring, environmental assessment, agricultural management, and military reconnaissance. Accurate detection and segmentation of cloud regions are of significant importance for improving the utilization efficiency of remote sensing data. However, clouds exhibit diverse and complex morphologies, including cirrus, cumulus, and stratus, with significant variations in thickness, transparency, and altitude. To address these characteristics, this study designs a U-striped cloud detection model based on a hybrid encoder combining convolution and a cross-stripe Transformer (UCT-Net). Built on a U-shaped network architecture, the proposed UCT-Net incorporates both convolutional and Transformer encoders to jointly extract features from satellite cloud imagery. Specifically, to enhance adaptability to diverse cloud morphologies, a cross-stripe Transformer module is designed to effectively capture variations in cloud morphology. In addition, a dual-weighted attention mechanism integrating texture and channel information, named the cross-stripe encoder and conv encoder merge module (CCM), is proposed to facilitate the deep fusion of the convolutional and cross-stripe Transformer encoders. UCT-Net was evaluated and validated on two datasets: the GF12MS WHU dataset, derived from GF-1 and GF-2 satellite data, and the HRC WHU dataset, sourced from Google Earth. The results show that UCT-Net achieved a precision of 92.70% on the GF12MS WHU dataset and 94.20% on the HRC WHU dataset, outperforming classical semantic segmentation algorithms. This demonstrates the superior performance of UCT-Net in cloud detection tasks.