自然资源遥感, 2024, 36(3): 216-224 doi: 10.6046/zrzyyg.2023094

技术应用

深度语义分割网络无人机遥感松材线虫病变色木识别

张瑞瑞,1,2,3, 夏浪1,2,3, 陈立平,1,2,3, 丁晨琛1,2,3, 郑爱春4, 胡新苗4, 伊铜川1,2,3, 陈梅香1,2,3, 陈天恩1,2,3,5

1.北京市农林科学院智能装备技术研究中心,北京 100097

2.国家农业智能装备工程技术研究中心,北京 100097

3.国家农业航空应用技术国际联合研究中心,北京 100097

4.南京市浦口区林业站,南京 211899

5.农芯(南京)智慧农业研究院有限公司,南京 211899

Identifying discolored trees infected with pine wilt disease using DSSN-based UAV remote sensing

ZHANG Ruirui,1,2,3, XIA Lang1,2,3, CHEN Liping,1,2,3, DING Chenchen1,2,3, ZHENG Aichun4, HU Xinmiao4, YI Tongchuan1,2,3, CHEN Meixiang1,2,3, CHEN Tianen1,2,3,5

1. Beijing Research Center of Intelligent Equipment for Agriculture, Beijing Academy of Agricultural and Forestry Sciences, Beijing 100097, China

2. National Research Center of Intelligent Equipment for Agriculture, Beijing Academy of Agricultural and Forestry Sciences, Beijing 100097, China

3. National Center for International Research on Agricultural Aerial Application Technology, Beijing Academy of Agricultural and Forestry Sciences, Beijing 100097, China

4. Nanjing Pukou District Forestry Station, Nanjing 211899, China

5. Nongxin (Nanjing) Intelligent Agricultural Research Institute Co., Ltd., Nanjing 211899, China

通讯作者: 陈立平(1973-),女,博士,研究员,主要研究方向为农业智能装备技术及应用研究。Email:chenlp@nercita.org.cn

责任编辑: 陈昊旻

收稿日期: 2023-04-18   修回日期: 2023-09-22  

基金资助: 国家重点研发计划项目“松材线虫病灾变机制与可持续防控技术”(2021YFD1400900)
南京市企业院士工作站关键核心技术攻关项目“林业松材线虫病害智慧防控系统研发与应用”
北京市农林科学院创新能力建设项目“农林重大虫害监测预警智能化平台研究与开发”(KJCX20230205)

Received: 2023-04-18   Revised: 2023-09-22  

作者简介 About authors

张瑞瑞(1983-),男,博士,研究员,主要研究方向为农林航空应用技术研究。Email: zhangrr@nercita.org.cn

摘要

松材线虫病是危害我国林业资源的主要病害,研究深度语义分割网络无人机遥感技术可提高松材线虫病变色木识别准确率,为提升和保护林业资源质量提供技术支撑。该文以青岛崂山松林为研究区,通过固定翼无人机航拍获取区域无人机松材线虫病疑似变色木影像,以全卷积网络(fully convolutional networks,FCN),U-Net,DeepLabV3+和OCNet 4种深度语义分割模型为研究对象,选用召回率(Recall)、精确率(Precision)、交并比(intersection over union,IoU)和F1值评估各模型分割精度。航拍飞行获得2 688张无人机影像,通过手动标记和样本扩增生成训练样本28 800个。4种网络均能够较好地识别松材线虫病变色木,无显著误报,并且深度语义模型对颜色相近的地物,如岩石、黄色裸土等有较好的辨别结果。总体上,DeepLabV3+具有最高的变色木分割精度,IoU与F1值分别为0.711和0.829; FCN模型分割精度最低,IoU与F1值分别为0.699和0.812; DeepLabV3+训练耗时最低,达到27.2 ms/幅; FCN预测耗时最低,达到7.2 ms/幅,但分割变色木的边缘精度最低。以ResNet50,ResNet101和ResNet152这3种网络为前端特征提取网络构建的DeepLabV3+模型,其变色木识别IoU值分别为0.711,0.702和0.702,F1值分别为0.829,0.822和0.820。DeepLabV3+比DeepLabV3网络具有更高的变色木识别精度,DeepLabV3网络变色木识别的IoU和F1值分别为0.701和0.812。DeepLabV3+模型在测试数据中具有最高变色木识别精度,特征提取网络ResNet网络深度对变色木识别精度影响较小。DeepLabV3+引入的编码和解码结构能够显著改进DeepLabV3分割精度,同时可获得详细的分割边缘,更有利于松材线虫病变色木识别。

关键词: 无人机遥感; 变色木; 深度学习

Abstract

Pine wilt disease (PWD) is identified as a major disease endangering the forest resources in China. Investigating deep semantic segmentation network (DSSN)-based unmanned aerial vehicle (UAV) remote sensing identification can improve the identification accuracy of discolored trees infected with PWD and provide technical support for the enhancement and protection of forest resource quality. Focusing on the pine forest in Laoshan Mountain in Qingdao, this study obtained images of suspected discolored trees through aerial photography using a fixed-wing UAV. It examined four deep semantic segmentation models, namely the fully convolutional network (FCN), U-Net, DeepLabV3+, and the object context network (OCNet), and assessed their segmentation accuracies using recall, precision, IoU, and F1 score. Based on the 2 688 images acquired, 28 800 training samples were obtained through manual labeling and sample amplification. The results indicate that the four models can effectively identify the discolored trees infected with PWD, with no significant false alarms. Furthermore, these deep learning models efficiently distinguished between surface features with similar colors, such as rocks and yellow bare soils. Generally, DeepLabV3+ outperformed the remaining three models, with an IoU of 0.711 and an F1 score of 0.829. In contrast, the FCN model exhibited the lowest segmentation accuracy, with an IoU of 0.699 and an F1 score of 0.812. DeepLabV3+ required the least training time, merely 27.2 ms per image, while FCN was the least time-consuming in prediction, with only 7.2 ms needed per image; however, FCN exhibited the lowest edge segmentation accuracy for discolored trees. Three DeepLabV3+ models constructed using ResNet50, ResNet101, and ResNet152 as front-end feature extraction networks exhibited IoUs of 0.711, 0.702, and 0.702 and F1 scores of 0.829, 0.822, and 0.820, respectively. DeepLabV3+ surpassed DeepLabV3 in the identification accuracy of discolored trees, with the latter showing an IoU of 0.701 and an F1 score of 0.812. The test data revealed that DeepLabV3+ exhibited the highest identification accuracy for discolored trees, while the depth of the ResNet feature extraction network produced minor impacts on the identification accuracy. The encoding and decoding structures introduced by DeepLabV3+ can significantly improve the segmentation accuracy of DeepLabV3, yielding more detailed edges. Therefore, DeepLabV3+ is more favorable for the identification of discolored trees infected with PWD.

Keywords: UAV remote sensing; discolored tree; deep learning


本文引用格式

张瑞瑞, 夏浪, 陈立平, 丁晨琛, 郑爱春, 胡新苗, 伊铜川, 陈梅香, 陈天恩. 深度语义分割网络无人机遥感松材线虫病变色木识别[J]. 自然资源遥感, 2024, 36(3): 216-224 doi:10.6046/zrzyyg.2023094

ZHANG Ruirui, XIA Lang, CHEN Liping, DING Chenchen, ZHENG Aichun, HU Xinmiao, YI Tongchuan, CHEN Meixiang, CHEN Tianen. Identifying discolored trees infected with pine wilt disease using DSSN-based UAV remote sensing[J]. Remote Sensing for Natural Resources, 2024, 36(3): 216-224 doi:10.6046/zrzyyg.2023094

0 引言

松材线虫病是由松材线虫引起的一种毁灭性松树病害,由于尚无有效的治疗手段,染病松树会在1~3个月内枯死,造成大面积的松林毁坏,破坏生态环境并导致大量林业经济效益损失[1-3]。截至2020年,松材线虫病已在我国18个省680个县级行政区发生,累计致死松树超过6亿株,危害区域已由我国南方扩散至辽宁大连,直接威胁我国近9亿亩(1亩≈666.67 m²)松林资源安全[4-5]。

松材线虫对松树的致病机理尚不明确,暂无有效救治手段,因此,对受害疫木的快速识别与砍伐清除是防止疫情扩散的关键[1]。人工地面调查是一种有效的疫木调查方法,但耗时耗力,且受地势影响显著,在地势险要区域难以开展调查工作。高分辨率卫星遥感数据能够提供区域松林受害状况[6],但受制于较低的空间分辨率,难以识别单株受害疫木[7]。

当前,基于无人机高分辨率遥感影像并结合计算机自动识别算法,是松材线虫病疑似变色木(以下简称松材线虫病变色木)快速识别的主要手段,也是主要的研究方向之一。例如,吴琼[8]基于无人机遥感获取的数据源,针对颜色和纹理特征开展松材线虫病变色木识别研究; 曾全等[9]评估了不同无人机飞行高度下变色松树的识别效率,并采用归一化植被指数监测变色松树,取得85.7%的精度; Iordache等[10]基于机载光谱成像设备获取松树的高光谱数据,采用随机森林算法识别染病松树; Syifa等[11]使用无人机采集真彩色图像,通过人工神经网络和支持向量机识别松材线虫病变色木,模型准确率分别为86.59%和79.33%。

基于深度卷积人工神经网络的深度学习技术在图像分割和目标检测领域取得巨大突破,显著提升了计算机视觉与图像领域的识别精度[12-13]。目前基于深度学习技术的目标检测和语义分割模型已用于松材线虫病识别,例如,徐信罗等[3]使用Faster R-CNN模型开展变色木识别与定位研究; 金远航等[14]基于YOLOv4网络和改进的前端特征提取网络构成新的网络模型,相比于YOLOv4-tiny,YOLOv4和SSD算法,精度分别提升9.58%,12.57%和10.54%,能够较好地实现对枯死树木的检测。目标检测模型较语义分割模型运行效率高,但目标检测模型仅提供变色木的大致空间位置,无法基于此信息进一步获取变色木的详细形态信息,例如变色木大小、形态等。上述形态信息对于后续评估染病松树的枯死阶段,确定受感染的松树冠层大小等至关重要。语义分割模型可提供逐像素的变色木分割结果,能够基于分割结果进一步提取变色木的详细形态信息,但当前仅有少量研究使用语义分割模型开展变色木分割研究[7],主流的语义分割模型用于松材线虫病变色木识别的精度尚待研究。因此,本研究获取区域的无人机遥感影像,人工目视解译标注训练样本,研究当前主流的深度语义分割网络对松材线虫病变色木的识别精度,期望有助于当前松材线虫病变色木防控工作的实施。

1 研究区概况与数据源

1.1 研究区概况

研究区域位于青岛市即墨区西南区域(A1)与青岛市崂山区北部(A2),如图1所示。研究区西部是青岛市城市建成区,东部临海,平均海拔360 m,最高海拔1 132.7 m,年平均温度14.2~15.0 ℃,年均降水约650 mm,植被覆盖以常绿针叶林为主,主导树种为黑松和赤松。采集变色木影像试验区域位于即墨区西南区域(A1)和崂山区北部(A2),其中A1区域面积25.26 km²,A2区域面积33.97 km²。

图1

图1   研究区地理位置概况

Fig.1   Location of the study area


1.2 数据源及样本获取

航拍使用DB-II固定翼无人机,相机设备采用Sony Alpha 7R II相机,采集影像尺寸为7 952像素×4 472像素。航拍分2个架次完成,航线设计如图1所示,A1区域航拍时间为2018年10月9日,A2区域航拍时间为2018年10月6日。无人机飞行高度700 m,飞行速度100 km/h,航向重叠度不低于75%,旁向重叠度不低于50%,采集影像地面分辨率为7 cm。飞行作业期间,天气状况良好,晴空、微风、无云雾干扰,航拍飞行获得A1区域1 241张影像,A2区域1 447张影像。

航拍飞行设置较大重叠率,获取的无人机影像间重叠大,并且大部分影像不含有变色木,为此本研究选择航拍区域中变色木较多的影像开展目视解译。具体选择36幅无人机影像,使用ENVI软件的感兴趣区域工具完成对变色木的标记,如图2(a)(b)所示。本文生成训练样本的流程如下: 首先将无人机图像像元值归一化至0~1; 然后分别在尺寸为7 952像素×4 472像素的无人机图像中随机截取200张尺寸为256像素×256像素的图像,每张截取图像均包含变色木像元; 最后将无人机图像旋转5°,10°和15°,并对每个旋转图像随机剪切获得200个尺寸为256像素×256像素的图像。完成上述操作后共获得训练样本28 800个,其中60%样本用于模型训练,20%用于模型验证,20%用于模型精度测试,部分样本数据如图2(c)—(l)所示。
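下述Python代码给出按上述流程生成训练样本的一个简化示意(基于OpenCV与NumPy; 文件路径、函数名与参数组织均为示意性假设,并非原文所用代码):

```python
import cv2
import numpy as np

def generate_samples(image_path, label_path, n_crops=200, size=256,
                     angles=(0, 5, 10, 15), seed=0):
    """按文中流程生成训练样本: 像元值归一化至0~1, 随机裁剪256×256子图,
    并对旋转5°,10°,15°后的影像重复裁剪(假定原图中含有变色木像元)。"""
    rng = np.random.default_rng(seed)
    image = cv2.imread(image_path).astype(np.float32) / 255.0   # 归一化至0~1
    label = cv2.imread(label_path, cv2.IMREAD_GRAYSCALE)        # 变色木掩码(0/1)

    samples = []
    for angle in angles:
        if angle == 0:
            img_r, lab_r = image, label
        else:  # 旋转影像与掩码, 掩码使用最近邻插值以保持0/1取值
            h, w = label.shape
            m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
            img_r = cv2.warpAffine(image, m, (w, h))
            lab_r = cv2.warpAffine(label, m, (w, h), flags=cv2.INTER_NEAREST)
        h, w = lab_r.shape
        count = 0
        while count < n_crops:
            y = rng.integers(0, h - size)
            x = rng.integers(0, w - size)
            lab_c = lab_r[y:y + size, x:x + size]
            if lab_c.sum() == 0:          # 保证每个样本均包含变色木像元
                continue
            samples.append((img_r[y:y + size, x:x + size], lab_c))
            count += 1
    return samples
```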

图2

图2   无人机影像及模型训练样本示例

Fig.2   UAV image and training samples


2 研究方法

2.1 深度语义分割模型

基于深度卷积神经网络的深度语义分割网络自2015年以来发展迅速,已形成一系列优秀的分割模型,如全卷积网络(fully convolutional networks,FCN)[15],PSPNet[16],SegNet[17],U-Net[18],Dense-ASPP[19]和DANet[20]等。根据模型设计的理念,可大致划分为4类: 语义分割模型的里程碑模型、对称结构的代表性模型、空洞卷积代表性模型和基于注意力机制的模型。具体的,本研究探究现阶段具有代表性的4种语义分割模型: 语义分割模型的里程碑模型FCN[15]、对称结构的代表性模型U-Net[18]、空洞卷积代表性模型DeepLabV3+[21]和基于注意力机制的OCNet[22]模型。

FCN是深度语义分割网络中的里程碑式分割模型,首次发表于2015年计算机视觉与模式识别会议(CVPR)。FCN的网络结构与深度分类网络VGG-Net[23]类似,其特点在于将分类模型的全连接层转换为卷积层,因此FCN是端到端、像素到像素的分割网络。FCN网络池化层pool5的输出通过上采样与另一池化层的输出融合,获得更精细的特征图。FCN通过融合不同分辨率的特征层生成不同分辨率的网络变体,例如FCN-32s,FCN-16s和FCN-8s。本研究选择具有最精细特征细节的FCN-8s作为测试模型。

U-Net是一种广泛使用的图像分割深度学习模型,最初在2015年国际医学图像计算和计算机辅助干预(medical image computing and computer-assisted intervention,MICCAI)会议发表,用于生物医学图像分割[18]。U-Net网络由下采样(编码)和上采样(解码)2个部分构成,形成“U”型架构。下采样用于提取图像特征,上采样用于恢复下采样学习获取的特征细节。具体地,U-Net网络下采样4次,累计降低图像分辨率16倍,再对称地上采样4次,将下采样生成的高级语义特征图恢复至输入图像分辨率。
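为说明U-Net“编码-解码+跳跃连接”的对称结构,下面给出一个大幅简化的PyTorch示意片段(仅含2次下采样/上采样,层数与通道数均为示意,并非本研究实际所用网络):

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # 两个3×3卷积 + ReLU, U-Net中典型的卷积单元
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    """仅含2次下采样/上采样的简化U-Net, 用于说明编码-解码与跳跃连接。"""
    def __init__(self, in_ch=3, n_classes=1):
        super().__init__()
        self.enc1 = conv_block(in_ch, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottom = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)
        self.head = nn.Conv2d(32, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                      # 高分辨率、低层级特征
        e2 = self.enc2(self.pool(e1))
        b = self.bottom(self.pool(e2))         # 最低分辨率的高级语义特征
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))   # 跳跃连接
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)                   # 逐像素分割输出
```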

DeepLab是Google公司提出的一系列代表性深度语义分割模型。DeepLab运用空洞卷积提升模型的感受野,在不显著降低特征图分辨率的条件下获得深层抽象语义特征[21-24]。DeepLabV3模型[24]实现了增强的空洞空间金字塔池化(atrous spatial pyramid pooling,ASPP)模块,以检测多个尺度的卷积特征并获得全局上下文编码特征。DeepLabV3+模型在ASPP模块基础上增加了编码器-解码器架构,提高了模型性能,可获得更为精细的分割边界。
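为说明ASPP利用不同膨胀率的空洞卷积并行提取多尺度上下文特征的思路,下面给出一个简化的PyTorch示意实现(膨胀率与通道数为常见设置,并非原文实现):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleASPP(nn.Module):
    """简化的ASPP模块示意: 并行的1×1卷积、多个不同膨胀率的3×3空洞卷积
    以及全局平均池化分支, 拼接后再经1×1卷积融合。"""
    def __init__(self, in_ch, out_ch=256, rates=(6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, 1, bias=False)] +
            [nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False)
             for r in rates])
        self.image_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(in_ch, out_ch, 1, bias=False))
        self.project = nn.Conv2d(out_ch * (len(rates) + 2), out_ch, 1, bias=False)

    def forward(self, x):
        size = x.shape[-2:]
        feats = [branch(x) for branch in self.branches]          # 多尺度空洞卷积分支
        pooled = F.interpolate(self.image_pool(x), size=size,
                               mode="bilinear", align_corners=False)  # 全局上下文分支
        return self.project(torch.cat(feats + [pooled], dim=1))
```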

OCNet使用自注意力机制捕获上下文语义信息,将局部特征与全局特征融合,得到准确的多尺度特征[22]。具体地,OCNet在FCN模型架构基础上,在空间和通道维度上使用2种类型的注意力模块捕获和整合不同尺度的上下文语义信息流,构建更为稳定的语义特征。本研究所用模型的前端特征提取网络与详细信息如表1所示。

表1   模型参数表

Tab.1  Parameters of deep learning models

模型 | 来源 | 参数数量 | 前端 | 特征
FCN | Long等,2015[15] | 15 305 667 | Xception[25] | 里程碑分割网络
U-Net | Ronneberger等,2015[18] | 26 355 169 | ResNet[26] | 对称分割网络
DeepLabV3+ | Chen等,2018[21] | 74 982 817 | ResNet | 空洞卷积、ASPP
OCNet | Yuan等,2018[22] | 36 040 105 | ResNet | 注意力机制



2.2 损失函数

损失函数计算模型的输出和真值之间的差异,用于衡量模型性能,选择合理的损失函数对于获得准确的结果至关重要。当前几种损失函数在深度学习模型中广泛使用,例如,交叉熵和Dice损失[27]可用于样本分布均衡的数据集,Focal Loss用于样本不平衡的数据集[28]。在本研究中,变色木和其他地物的数量差异较大,为正确地训练模型,选择Focal Loss作为损失函数,公式为:

$FocalL(p) = -\alpha (1-p)^{\gamma} \log_2 p$

式中: FocalL(p)为损失值; α和γ为权重因子,α∈[0,1],γ∈[0,5]; p为模型预测值。
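据此,Focal Loss可按如下方式实现(以逐像素二分类分割为例的示意代码; 上式仅给出正类一项,代码按通常做法对负类作对称处理,对数底采用深度学习框架常用的自然对数,与log₂仅相差常数因子; α,γ取值仅为示例):

```python
import torch

def focal_loss(pred, target, alpha=0.25, gamma=2.0, eps=1e-7):
    """逐像素二分类Focal Loss示意实现。
    pred: 模型输出的前景(变色木)概率, 取值0~1; target: 真值掩码(0或1)。"""
    pred = pred.clamp(eps, 1.0 - eps)
    # 正样本项: -alpha*(1-p)^gamma*log(p); 负样本对称地以(1-p)为概率
    loss_pos = -alpha * (1 - pred) ** gamma * torch.log(pred) * target
    loss_neg = -(1 - alpha) * pred ** gamma * torch.log(1 - pred) * (1 - target)
    return (loss_pos + loss_neg).mean()
```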

2.3 模型训练

用于模型训练的工作站配置为: Intel Xeon E5-2630处理器,64 GB内存,1 TB磁盘,NVIDIA P100 16 GB显卡。表1中所示模型基于PyTorch 1.6构建,采用Adam优化器,学习率为0.001,训练100轮,批(batch)大小为16。每个模型单独训练3次,保存验证数据集损失(loss)值最小的模型参数作为最优模型参数。
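按上述配置,单个模型的训练流程可示意如下(数据集对象、模型与损失函数均为假设输入,损失函数可使用前文的Focal Loss示意实现; 该片段仅为流程示意,并非原文代码):

```python
import torch
from torch.utils.data import DataLoader

def train(model, criterion, train_set, val_set,
          epochs=100, lr=1e-3, batch_size=16, device="cuda"):
    """训练流程示意: Adam优化器, 学习率0.001, batch大小16, 训练100轮,
    保存验证集损失最小的模型参数。"""
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=batch_size)
    best_val = float("inf")
    for epoch in range(epochs):
        model.train()
        for images, masks in train_loader:
            images, masks = images.to(device), masks.to(device)
            optimizer.zero_grad()
            loss = criterion(torch.sigmoid(model(images)), masks)
            loss.backward()
            optimizer.step()
        model.eval()                           # 在验证集上评估
        with torch.no_grad():
            val_loss = sum(
                criterion(torch.sigmoid(model(x.to(device))), y.to(device)).item()
                for x, y in val_loader) / len(val_loader)
        if val_loss < best_val:                # 保存验证损失最小的模型参数
            best_val = val_loss
            torch.save(model.state_dict(), "best_model.pth")
```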

2.4 精度评估

本研究采用召回率(Recall)、精确率(Precision)、F1值和交并比(intersection over union,IoU)评估各模型分割精度,计算公式分别为:

$Recall = \frac{TP}{TP+FN}$

$Precision = \frac{TP}{TP+FP}$

$F1 = \frac{2 \times Precision \times Recall}{Precision+Recall}$

$IoU = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A|+|B|-|A \cap B|}$

式中: TP为正样本被正确识别为正样本的数量; FP为负样本被错误识别为正样本的数量; FN为正样本被错误识别为负样本的数量; F1值为Recall与Precision的调和平均值; IoU为预测值A与真实值B之间的交并比,因此其能够指示模型对变色木的识别精度。IoU值越高表明分割精度越高,基于此结果提取的变色木形态信息精度越高。
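基于二值化后的预测掩码与真值掩码,上述指标可按如下方式计算(示意代码,函数名与参数为假设):

```python
import numpy as np

def evaluate(pred, truth, eps=1e-7):
    """由二值预测掩码pred与真值掩码truth计算Recall, Precision, F1与IoU。"""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()      # 正确识别的变色木像元数
    fp = np.logical_and(pred, ~truth).sum()     # 误报像元数
    fn = np.logical_and(~pred, truth).sum()     # 漏报像元数
    recall = tp / (tp + fn + eps)
    precision = tp / (tp + fp + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    iou = tp / (tp + fp + fn + eps)             # |A∩B| / |A∪B|
    return recall, precision, f1, iou
```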

3 结果与分析

图3所示为各模型训练损失值和F1值。从图中可知,随着迭代次数增加,模型逐渐收敛,损失值趋近于0.1,F1值接近0.9,表明各模型均能够正确地完成训练。

图3

图3   模型训练损失值和F1值

Fig.3   Training loss and F1-score of each model


使用训练后的模型分别对待测试无人机影像进行预测,得到变色木图像分割结果。表2给出了选取的不同光照条件下、具有代表性背景地物的图像及对应分割结果。表2中5幅真彩色图像均含有变色木,具体的变色木掩码如表中“地面真值”所示。影像a由健康松树与变色木构成; b由健康松树、变色木与颜色相近的裸土构成; c由绿色植被、岩石与变色木构成; d的构成与c类似; e由绿色植被、变色木、裸土与建筑物构成,该图像在光照不充足条件下获得。

表2   模型分割结果

Tab.2  Segmentation results of models

(表2各列依次为: 序号、无人机影像、地面真值、FCN结果、U-Net结果、DeepLabV3+结果、OCNet结果; a—e各行为对应影像及分割结果,图略。)



分析表2可知,上述4种网络均能够较好地对松材线虫病变色木进行分割,无明显误报,表明深度语义分割模型对颜色相近的地物,如岩石、黄色裸土等有较强的分辨力。具体对比不同分割网络的变色木分割结果可知,FCN网络的分割边缘最为粗糙,这是由于FCN网络对低分辨率、高层级特征与高分辨率、低层级特征的融合方式较为简单,导致最终获得的特征图分辨率较低。相比FCN网络,U-Net和DeepLabV3+改进了不同深度特征的融合方式,具有更为精细的分割边缘。

不同分割模型的识别总体精度均高于0.9,具体精度如表3所示。从表中可知,DeepLabV3+具有最高的分割精度,IoU与F1值分别为0.711和0.829; FCN模型分割精度最低,IoU与F1值分别为0.699和0.812。U-Net网络具有较高精度,IoU与F1值均高于OCNet与FCN模型。因此,基于空洞卷积、编码和解码结构的DeepLabV3+模型具有最高的变色木分割精度,能够获得更为精确的变色木分割边缘。从表中还可知4种模型的训练与预测性能: DeepLabV3+训练耗时最低; 在结果预测上,FCN具有优势,DeepLabV3+耗时低于U-Net和OCNet。

表3   模型精度与性能

Tab.3  Accuracies and time usages of models

模型 | IoU | F1 | Precision | Recall | 训练/(ms·幅⁻¹) | 预测/(ms·幅⁻¹)
FCN | 0.699 | 0.812 | 0.821 | 0.804 | 53.1 | 7.2
U-Net | 0.710 | 0.825 | 0.821 | 0.828 | 44.5 | 20.5
DeepLabV3+ | 0.711 | 0.829 | 0.826 | 0.833 | 27.2 | 14.8
OCNet | 0.706 | 0.820 | 0.824 | 0.817 | 47.6 | 16.9



4 讨论

4.1 特征提取网络的深度对变色木识别精度影响

现阶段主流的深度语义分割模型多由前端和后端2部分构成,即特征提取网络与后端语义分割结构。特征提取网络提供模型开展语义分割所需的特征流,语义分割结构则处理后续的对象分割,因此前端(特征提取网络)的性能对最终的分割结果具有显著影响[7]。例如,本研究中变色木分割精度最高的DeepLabV3+模型使用ResNet网络[26]作为前端,而ResNet根据不同配置可以生成不同的网络深度,例如ResNet50,ResNet101,ResNet152等。为此,本研究使用不同深度的ResNet作为前端,比较分析在变色木分割中最适用于DeepLabV3+模型的ResNet特征提取网络。具体地,分别使用ResNet50,ResNet101,ResNet152这3种网络作为前端,对模型进行训练和预测,表4为预测结果示例。
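若采用开源库segmentation_models_pytorch(原文未说明具体实现库,此处仅作示意,库名与参数均为假设),更换DeepLabV3+的前端特征提取网络可按如下方式完成,后端结构保持不变,便于比较不同深度的ResNet前端:

```python
import segmentation_models_pytorch as smp

# 分别以ResNet50, ResNet101, ResNet152作为前端特征提取网络构建DeepLabV3+
# (单类别变色木分割, classes=1; 预训练权重与输入通道数为常见设置)
models = {
    name: smp.DeepLabV3Plus(encoder_name=name, encoder_weights="imagenet",
                            in_channels=3, classes=1)
    for name in ("resnet50", "resnet101", "resnet152")
}
```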

表4   基于不同ResNet的DeepLabV3+模型分割结果

Tab.4  Segmentation results of DeepLabV3+ with different backbones

(表4各列依次为: 序号、无人机影像、地面真值、DeepLabV3+(ResNet50)分割结果、DeepLabV3+(ResNet101)分割结果、DeepLabV3+(ResNet152)分割结果、DeepLabV3(ResNet50)分割结果; a—e各行为对应影像及分割结果,图略。)



从表4示例结果可知,随着ResNet网络深度的增加,DeepLabV3+网络的变色木分割精度并未显著提升。例如在影像e对应的分割结果中,随着ResNet网络深度的增加,土壤与变色木的分割结果未见改善,且基于ResNet152的DeepLabV3+网络对人造建筑物(影像e右下角的红色建筑物)的分割结果也未改善。不同ResNet网络深度对应的DeepLabV3+网络分割结果的测试精度如表5所示。整体上,基于ResNet50的DeepLabV3+网络具有最高的变色木分割精度,IoU与F1值分别为0.711和0.829。

表5   模型精度

Tab.5  Accuracy of models

模型 | IoU | F1 | Precision | Recall
DeepLabV3+(ResNet50) | 0.711 | 0.829 | 0.826 | 0.833
DeepLabV3+(ResNet101) | 0.702 | 0.822 | 0.847 | 0.798
DeepLabV3+(ResNet152) | 0.702 | 0.820 | 0.846 | 0.796
DeepLabV3(ResNet50) | 0.701 | 0.812 | 0.813 | 0.811



4.2 DeepLabV3系列模型编解码结构对变色木识别精度影响

在相同的特征提取网络条件下,语义分割网络的后端结构直接决定分割结果的精度以及分割边缘的精细程度。DeepLabV3网络后端设计依赖于ASPP模块,DeepLabV3+模型则在DeepLabV3网络ASPP模块基础上添加类似FCN和U-Net等模型的编码和解码结构,改进分割边缘的精度,获得更为详细的分割边缘。为此,本研究对比DeepLabV3和DeepLabV3+网络,分析DeepLabV3+网络中编码和解码结构对变色木分割结果的贡献。具体地,DeepLabV3和DeepLabV3+网络均使用ResNet50作为前端,训练模型并进行预测和精度验证,结果示例见表4。由表4可知,在相同的前端特征提取网络下,DeepLabV3+能够获得更为精细的变色木分割边缘,同时变色木分割精度也高于DeepLabV3(见表5)。这表明DeepLabV3+引入的编码和解码结构能够显著地改进DeepLabV3模型的分割精度,同时能够获得详细的分割边缘,有利于松材线虫病变色木识别。

4.3 疑似变色木识别结果受其他因素干扰

当前基于可见光无人机影像数据对疑似松材线虫变色木开展检测的原理在于松树受松材线虫侵染后发生枯萎,表观颜色发生明显变化。由于松树受其他病虫害危害或面临环境胁迫时也存在类似的颜色变化特征,因此当前基于可见光影像数据开展变色木识别获得的结果仅为疑似变色木。对于已经明确为松材线虫病疫区的林场,基于可见光影像识别的疑似变色木由松材线虫病致病的概率高,在实际工作中一般可认为是松材线虫病变色木; 对于其他地区则需要配合实验室镜检或基因检测等手段进一步确认。

5 结论

本研究采用航拍获取大区域无人机松材线虫病疑似变色木影像,通过手动解译绘制训练样本,研究当前主流语义分割网络对变色木的识别性能。研究结果表明,在不同光照、地物颜色特征相近的复杂背景条件下,当前主流的语义分割网络均可以较好地识别变色木,FCN,U-Net,DeepLabV3+和OCNet网络的IoU值分别达到0.699,0.710,0.711和0.706。在识别边缘精度上,FCN网络由于不同分辨率特征的混合方式单一,分割边缘精度最低; U-Net和DeepLabV3+保留高分辨率特征、改进了不同分辨率特征的混合方式,具有更为精细的分割边缘。

DeepLabV3+与DeepLabV3的对比分析表明,DeepLabV3+引入的编码解码结构显著地改进了变色木分割边缘的精度,有利于获得更为详细的变色木形态信息。基于ResNet50的DeepLabV3+网络具有最高的变色木分割精度,IoU与F1值分别达到0.711和0.829; 前端特征提取网络ResNet的网络深度(如ResNet50,ResNet101和ResNet152)对DeepLabV3+变色木分割精度并无显著影响。

参考文献

[1] Proença D N, Grass G, Morais P V. Understanding pine wilt disease:Roles of the pine endophytic bacteria and of the bacteria carried by the disease-causing pinewood nematode[J]. MicrobiologyOpen, 2017, 6(2):e00415.

[2] 张瑞瑞, 夏浪, 陈立平, 等. 基于U-Net网络和无人机影像的松材线虫病变色木识别[J]. 农业工程学报, 2020, 36(12):61-68.
Zhang R R, Xia L, Chen L P, et al. Recognition of wilt wood caused by pine wilt nematode based on U-Net network and unmanned aerial vehicle images[J]. Transactions of the Chinese Society of Agricultural Engineering, 2020, 36(12):61-68.

[3] 徐信罗, 陶欢, 李存军, 等. 基于Faster R-CNN的松材线虫病受害木识别与定位[J]. 农业机械学报, 2020, 51(7):228-236.
Xu X L, Tao H, Li C J, et al. Detection and location of pine wilt disease induced dead pine trees based on Faster R-CNN[J]. Transactions of the Chinese Society for Agricultural Machinery, 2020, 51(7):228-236.

[4] 叶建仁. 松材线虫病在中国的流行现状、防治技术与对策分析[J]. 林业科学, 2019, 55(9):1-10.
Ye J R. Epidemic status of pine wilt disease in China and its prevention and control techniques and counter measures[J]. Scientia Silvae Sinicae, 2019, 55(9):1-10.

[5] 国家林业和草原局. 国家林业和草原局公告(2020年第4号)(2020年松材线虫病疫区)[EB/OL]. [2020-03-16]. http://www.forestry.gov.cn/main/3457/20200326/145712092854308.html.
State Forestry and Grassland Administration. Announcement of the National Forestry and Grassland Administration (No.4 of 2020) (Pine wood nematode disease epidemic area in 2020)[EB/OL]. [2020-03-16]. http://www.forestry.gov.cn/main/3457/20200326/145712092854308.html.

[6] 许青云, 李莹, 谭靖, 等. 基于高分六号卫星数据的红树林提取方法[J]. 自然资源遥感, 2023, 35(1):41-48. doi:10.6046/zrzyyg.2022048.
Xu Q Y, Li Y, Tan J, et al. Information extraction method of mangrove forests based on GF-6 data[J]. Remote Sensing for Natural Resources, 2023, 35(1):41-48. doi:10.6046/zrzyyg.2022048.

[7] Xia L, Zhang R, Chen L, et al. Evaluation of deep learning segmentation models for detection of pine wilt disease in unmanned aerial vehicle images[J]. Remote Sensing, 2021, 13(18):3594.

[8] 吴琼. 基于遥感图像的松材线虫病区域检测算法研究[D]. 合肥: 安徽大学, 2013.
Wu Q. Research on Bursaphelenchus xylophilus area detection based on remote sensing image[D]. Hefei: Anhui University, 2013.

[9] 曾全, 孙华富, 杨远亮, 等. 无人机监测松材线虫病的精度比较[J]. 四川林业科技, 2019, 40(3):92-95,114.
Zeng Q, Sun H F, Yang Y L, et al. Precision comparison for pine wood nematode disease monitoring by UAV[J]. Journal of Sichuan Forestry Science and Technology, 2019, 40(3):92-95,114.

[10] Iordache M D, Mantas V, Baltazar E, et al. A machine learning approach to detecting pine wilt disease using airborne spectral imagery[J]. Remote Sensing, 2020, 12(14):2280.

[11] Syifa M, Park S J, Lee C W. Detection of the pine wilt disease tree candidates for drone remote sensing using artificial intelligence techniques[J]. Engineering, 2020, 6(8):919-926.

[12] Xia L, Zhao F, Chen J, et al. A full resolution deep learning network for paddy rice mapping using Landsat data[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2022, 194:91-107.

[13] 胡建文, 汪泽平, 胡佩. 基于深度学习的空谱遥感图像融合综述[J]. 自然资源遥感, 2023, 35(1):1-14. doi:10.6046/zrzyyg.2021433.
Hu J W, Wang Z P, Hu P. A review of pansharpening methods based on deep learning[J]. Remote Sensing for Natural Resources, 2023, 35(1):1-14. doi:10.6046/zrzyyg.2021433.

[14] 金远航, 徐茂林, 郑佳媛. 基于改进YOLOv4-tiny的无人机影像枯死树木检测算法[J]. 自然资源遥感, 2023, 35(1):90-98. doi:10.6046/zrzyyg.2022018.
Jin Y H, Xu M L, Zheng J Y. A dead tree detection algorithm based on improved YOLOv4-tiny for UAV images[J]. Remote Sensing for Natural Resources, 2023, 35(1):90-98. doi:10.6046/zrzyyg.2022018.

[15] Shelhamer E, Long J, Darrell T. Fully convolutional networks for semantic segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4):640-651.

[16] Zhao H, Shi J, Qi X, et al. Pyramid scene parsing network[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI, USA: IEEE, 2017:6230-6239.

[17] Badrinarayanan V, Kendall A, Cipolla R. SegNet:A deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12):2481-2495.

[18] Ronneberger O, Fischer P, Brox T. U-Net:Convolutional networks for biomedical image segmentation[C]// International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer International Publishing, 2015:234-241.

[19] Yang M, Yu K, Zhang C, et al. DenseASPP for semantic segmentation in street scenes[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018:3684-3692.

[20] Fu J, Liu J, Tian H, et al. Dual attention network for scene segmentation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, CA, USA: IEEE, 2019:3141-3149.

[21] Chen L C, Zhu Y, Papandreou G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]// Proceedings of the European Conference on Computer Vision (ECCV). Munich: ACM, 2018:833-851.

[22] Yuan Y, Huang L, Guo J, et al. OCNet:Object context network for scene parsing[EB/OL]. 2018: arXiv:1809.00916. https://arxiv.org/abs/1809.00916.pdf.

[23] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[EB/OL]. arXiv:1409.1556. https://arxiv.org/abs/1409.1556.pdf.

[24] Chen L C, Papandreou G, Schroff F, et al. Rethinking atrous convolution for semantic image segmentation[EB/OL]. arXiv:1706.05587. https://arxiv.org/abs/1706.05587.pdf.

[25] Chollet F. Xception:Deep learning with depthwise separable convolutions[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI, USA: IEEE, 2017:1800-1807.

[26] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA: IEEE, 2016:770-778.

[27] Huang Q, Sun J, Ding H, et al. Robust liver vessel extraction using 3D U-Net with variant dice loss function[J]. Computers in Biology and Medicine, 2018, 101:153-162.

[28] Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]// 2017 IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017:2999-3007.

