Abstract:
Extracting water body information from unmanned aerial vehicle (UAV) images faces several challenges, including interference from occlusion, misclassification caused by mixed pixels, and omission of tiny water bodies. To address these issues, this study proposed a high-precision water body segmentation network that integrates a residual structure with an attention mechanism. First, a deep encoder based on ResNet50 (a 50-layer residual network) was constructed to enhance semantic feature representation through residual connections. Second, a channel-spatial dual-dimensional attention mechanism was introduced into the skip connections to achieve dynamic feature calibration: the channel attention reweighted water body saliency, while the spatial attention focused on areas sensitive to water body boundaries. Third, a hybrid Focal-Dice loss function was designed to reduce boundary overlaps of tiny water bodies through hard-sample mining, thereby jointly optimizing for class imbalance and spatial structural information. The proposed network was compared with four mainstream models: the fully convolutional network (FCN), SegNet, DeepLabV3+, and the classic U-Net. Qualitative analysis and quantitative evaluation showed that the proposed network outperformed all four baseline models, achieving a precision of 98.29% and a recall of 97.00%. The proposed network therefore offers a solution that balances accuracy, efficiency, and robustness for extracting water body information from UAV remote sensing images.
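
To make the idea of a hybrid Focal-Dice loss concrete, the following is a minimal sketch in plain Python for binary water/non-water segmentation over flattened pixel probabilities. The abstract does not state the exact formulation, so everything specific here is an assumption made for illustration: the function names, the focusing parameter `gamma`, the class weight `alpha`, the Dice smoothing term `smooth`, and the mixing weight `weight` are hypothetical and are not the values used in the study.

```python
import math


def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss; gamma and alpha are assumed defaults."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)          # clip to avoid log(0)
        p_t = p if t == 1 else 1.0 - p            # probability of the true class
        alpha_t = alpha if t == 1 else 1.0 - alpha
        # (1 - p_t)^gamma down-weights easy pixels, emphasizing hard ones.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2|P∩T| + s) / (|P| + |T| + s)."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)


def focal_dice_loss(probs, targets, weight=0.5):
    """Weighted sum of focal and Dice terms; weight is an assumption."""
    return (weight * focal_loss(probs, targets)
            + (1.0 - weight) * dice_loss(probs, targets))


# Toy example: four pixels, two of which are water (label 1).
labels = [1, 0, 0, 1]
confident = [0.9, 0.1, 0.2, 0.8]   # mostly correct predictions
uncertain = [0.6, 0.4, 0.5, 0.3]   # one water pixel is largely missed
loss_confident = focal_dice_loss(confident, labels)
loss_uncertain = focal_dice_loss(uncertain, labels)
```

In this sketch the focal term handles class imbalance by emphasizing hard, misclassified pixels, while the Dice term rewards region-level overlap between prediction and label, which is one plausible reading of how the two objectives complement each other for small water bodies.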