Prediction of lead content in soil based on model population analysis coupled with ELM algorithm

肖烨辉; 宋妮迪; 孟盼盼; 王培俊; 范胜龙

doi:10.6046/zrzyyg.2020378

Abstract

To identify the optimal inversion model for regional soil heavy metal content, this study took Longhai City as the study area. The raw soil spectral data were preprocessed separately with four methods: Savitzky-Golay (SG) smoothing, wavelet transform (WT), Gaussian filtering (GF), and multiplicative scatter correction (MSC). Interfering and uninformative wavelength variables were then removed using four wavelength selection algorithms developed under the model population analysis (MPA) strategy: competitive adaptive reweighted sampling (CARS), the variable iterative space shrinkage approach (VISSA), iteratively variable subset optimization (IVSO), and interval combination optimization (ICO). Finally, the soil lead (Pb) content was predicted by regression with a linear model, partial least squares regression (PLSR); a nonlinear model, support vector machine (SVM); and a neural network model, extreme learning machine (ELM). The results are as follows. ① Among the Pb content inversion models built after the different preprocessing methods, the model based on spectral data reconstructed at the seventh level of the wavelet transform had the highest prediction accuracy, with validation-set R²=0.736, RMSE=5.426, RPD=1.976, and RPIQ=2.560. ② CARS, VISSA, IVSO, and ICO all significantly improved model interpretability and generalization performance and also increased modeling efficiency. ③ The overall prediction performance of the three regression models ranked ELM>PLSR>SVM. ICO-ELM achieved the highest prediction accuracy, with validation-set R²=0.863, RMSE=3.953, RPD=2.712, and RPIQ=3.514. The optimal model established in this study can provide a new theoretical reference for the rapid and accurate monitoring of regional land quality and ecological indicators.
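The four validation metrics and the ELM regressor named in the abstract can be illustrated with a short sketch. This is not the authors' implementation. It assumes the common definitions RPD = SD/RMSE and RPIQ = IQR/RMSE, a sample standard deviation (`ddof=1`), a sigmoid activation, Gaussian random hidden weights, and a pseudoinverse solution for the output weights. The names `regression_metrics` and `ELMRegressor`, the hidden-layer size, and the synthetic data are all hypothetical. Under these definitions the reported figures are mutually consistent: RMSE × RPD gives about 10.72 for both the wavelet-only and the ICO-ELM results, and RMSE × RPIQ gives about 13.89 for both.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, RPD and RPIQ under the assumed definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    rmse = np.sqrt(np.mean(resid ** 2))
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    sd = np.std(y_true, ddof=1)                 # assumption: sample SD
    q1, q3 = np.percentile(y_true, [25, 75])    # assumption: linear-interpolated quartiles
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": rmse,
        "RPD": sd / rmse,
        "RPIQ": (q3 - q1) / rmse,
    }


class ELMRegressor:
    """Single-hidden-layer ELM: random fixed hidden layer, least-squares output layer."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden    # hypothetical default, not from the paper
        self.seed = seed

    def _hidden(self, X):
        # Sigmoid activation of the random projection (assumed activation).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        X = np.asarray(X, dtype=float)
        self.W = rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = rng.normal(size=self.n_hidden)
        # Output weights from the Moore-Penrose pseudoinverse of the hidden output.
        self.beta = np.linalg.pinv(self._hidden(X)) @ np.asarray(y, dtype=float)
        return self

    def predict(self, X):
        return self._hidden(np.asarray(X, dtype=float)) @ self.beta


# Synthetic stand-in for selected wavelength variables and Pb content.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 10))
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=60)

model = ELMRegressor(n_hidden=20, seed=0).fit(X[:40], y[:40])
pred = model.predict(X[40:])
metrics = regression_metrics(y[40:], pred)
print(metrics)
```

In a pipeline like the one described, `X` would hold the preprocessed reflectance values at the wavelengths kept by CARS, VISSA, IVSO, or ICO, and the metrics would be computed on a held-out validation set.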
