A method for identifying lithology based on a feature-weighted KNN model
GUO Yu-Shan1, WANG Wan-Yin1,2,3
1. School of Geological Engineering and Geomatics, Chang'an University, Xi'an 710054, China
2. Key Laboratory of Marine Geology & Environment, Qingdao 266071, China
3. National Engineering Research Center of Offshore Oil and Gas Exploration, Beijing 100028, China
|
|
Abstract Lithology identification is a fundamental geological task that underpins the exploration of solid minerals, oil, and gas. Because the physical properties of rocks link lithology to geophysical fields, differences in these properties can be used to identify lithology. However, the physical property data of different rocks frequently overlap, which makes accurate lithology identification difficult using cross plots alone. The K-nearest neighbor (KNN) model is a simple and direct machine learning method with high accuracy and sensitivity, which makes it well suited to multi-class classification. This study introduces a feature-weighted KNN model for lithology identification. The model combines the conventional KNN model with the information gain of the attribute features and assigns each feature a weight accordingly, so that the weights directly reflect how important each feature is to the classification. Experiments show that, compared with the conventional KNN model, the feature-weighted KNN model delineates lithologic boundaries more clearly and improves the overall accuracy and stability of lithology identification.
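The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the equal-width binning used to compute information gain for continuous features, the weighted Euclidean distance, the function names, and the toy data are all assumptions made for illustration, and the features are assumed to be already scaled to comparable ranges.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, bins=5):
    """Information gain of one continuous feature after equal-width binning.

    Equal-width binning is an assumption of this sketch.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # constant feature: every sample falls in bin 0
    groups = {}
    for v, label in zip(values, labels):
        idx = min(int((v - lo) / width), bins - 1)
        groups.setdefault(idx, []).append(label)
    conditional = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def feature_weights(X, y, bins=5):
    """Per-feature information gains, normalized so the weights sum to 1."""
    n_features = len(X[0])
    gains = [information_gain([row[j] for row in X], y, bins) for j in range(n_features)]
    total = sum(gains)
    if total == 0:
        return [1.0 / n_features] * n_features  # fall back to equal weights
    return [g / total for g in gains]


def weighted_knn_predict(X_train, y_train, x, weights, k=3):
    """Majority vote among the k nearest samples under a weighted Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b)))

    nearest = sorted(zip(X_train, y_train), key=lambda s: dist(s[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X_train = [[0.1, 0.9], [0.2, 0.1], [0.15, 0.5], [0.8, 0.85], [0.9, 0.15], [0.85, 0.5]]
y_train = ["slate", "slate", "slate", "granite", "granite", "granite"]

weights = feature_weights(X_train, y_train)
print(weights)
print(weighted_knn_predict(X_train, y_train, [0.85, 0.1], weights, k=3))
```

In this toy case the uninformative feature receives almost no weight, so the neighbor search is driven by the feature that actually separates the classes.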
|
Received: 15 June 2023
Published: 16 April 2024
|
|
Corresponding Authors:
WANG Wan-Yin
E-mail: gys103319@163.com;wwy7902@chd.edu.cn
|
|
|
|
|
Rock distribution of the model
|
Rock type | Density (kg·m⁻³) | Magnetic susceptibility (10⁻⁵ SI) | Resistivity (Ω·m)
Slate | 2630–2850 | 0–160 | 3–8
Gneiss | 2570–2830 | 180–280 | 40–60
Granite | 2580–2640 | 0–160 | 10–30
|
Physical property parameters of the model
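The abstract notes that the physical properties of different rocks often overlap. A quick way to see this on the tabulated ranges is to test each pair of rocks for interval overlap, property by property. The sketch below only restates the table values; the dictionary layout and function names are illustrative assumptions.

```python
from itertools import combinations

# (min, max) ranges copied from the physical property table of the model.
RANGES = {
    "slate":   {"density": (2630, 2850), "susceptibility": (0, 160),   "resistivity": (3, 8)},
    "gneiss":  {"density": (2570, 2830), "susceptibility": (180, 280), "resistivity": (40, 60)},
    "granite": {"density": (2580, 2640), "susceptibility": (0, 160),   "resistivity": (10, 30)},
}


def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


def overlapping_pairs(prop):
    """Rock pairs whose value ranges overlap for the given property."""
    return [
        (r1, r2)
        for r1, r2 in combinations(RANGES, 2)
        if intervals_overlap(RANGES[r1][prop], RANGES[r2][prop])
    ]


for prop in ("density", "susceptibility", "resistivity"):
    print(prop, overlapping_pairs(prop))
```

On these ranges, density overlaps for every pair of rocks, magnetic susceptibility overlaps only for slate and granite, and resistivity does not overlap for any pair.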
|
|
Physical property distributions of the model: (a) density; (b) magnetic susceptibility; (c) resistivity
|
|
Cross plot of physical property parameters for 250 training samples
|
|
Cross plot of physical property parameters for 435 training samples
|
|
Cross plot of physical property parameters for 870 training samples
|
Training samples | Mean accuracy (%) | Variance (%) | RMSE (%)
250 | 90.22 | 0.42 | 11.68
435 | 90.81 | 0.52 | 11.58
870 | 90.09 | 0.26 | 11.09
|
Accuracy and error statistics of the model
|
Training samples | Optimal K | Accuracy (%)
250 | 4 | 91.41
435 | 20 | 93.95
870 | 21 | 92.51
|
Optimal K value and accuracy of the model
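This page does not state how the optimal K values were selected. One common approach, shown here purely as an assumption, is to scan a set of candidate K values and keep the one with the highest leave-one-out accuracy. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k training samples closest to x (Euclidean distance)."""
    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))
    votes = Counter(y_train[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(X, y, k):
    """Leave-one-out accuracy: predict each sample from all the others."""
    hits = 0
    for i in range(len(X)):
        X_rest = X[:i] + X[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        hits += knn_predict(X_rest, y_rest, X[i], k) == y[i]
    return hits / len(X)


def best_k(X, y, candidates):
    """Return the candidate K with the highest leave-one-out accuracy, plus all scores."""
    scores = {k: loo_accuracy(X, y, k) for k in candidates}
    return max(scores, key=scores.get), scores


# Hypothetical one-feature toy data with two well-separated classes.
X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
y = ["a", "a", "a", "b", "b", "b"]

k_opt, scores = best_k(X, y, candidates=[1, 3, 5])
print(k_opt, scores)
```

When several candidates tie, this sketch keeps the first one in the candidate list; a k-fold split would be a cheaper alternative to leave-one-out on larger training sets.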
|
Number of samples | Feature | Information gain | Weight
250 | Density | 0.8311 | 0.3145
250 | Magnetic susceptibility | 0.6502 | 0.2461
250 | Resistivity | 1.1610 | 0.4394
435 | Density | 0.6282 | 0.2628
435 | Magnetic susceptibility | 0.7167 | 0.2998
435 | Resistivity | 1.0459 | 0.4374
870 | Density | 0.7041 | 0.2754
870 | Magnetic susceptibility | 0.7096 | 0.2776
870 | Resistivity | 1.1425 | 0.4470
|
Feature information gain and weight statistics
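The tabulated weights appear to be the information gains normalized to sum to 1 within each sample size; for 250 samples, 0.8311 / (0.8311 + 0.6502 + 1.1610) ≈ 0.3145. The sketch below reproduces that arithmetic for the 250-sample row. The normalization rule is inferred from the numbers rather than stated on this page, and the function name is illustrative.

```python
# Information gain values reported for 250 training samples.
gains_250 = {
    "density": 0.8311,
    "magnetic susceptibility": 0.6502,
    "resistivity": 1.1610,
}


def normalize_gains(gains):
    """Divide each information gain by the sum of all gains."""
    total = sum(gains.values())
    return {name: g / total for name, g in gains.items()}


weights_250 = normalize_gains(gains_250)
for name, w in weights_250.items():
    print(f"{name}: {w:.4f}")
```

Resistivity receives the largest weight at every sample size in the table.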
|
Training samples | Mean accuracy (%) | Variance (%) | RMSE (%)
250 | 93.68 | 0.20 | 7.70
435 | 93.62 | 0.20 | 7.86
870 | 93.76 | 0.35 | 8.52
|
Accuracy and error statistics of the feature-weighted model
|
|
Accuracy curves of the two models for different K values
|
Training samples | Optimal K | Error rate, conventional KNN (%) | Error rate, feature-weighted KNN (%)
250 | 4 | 8.59 | 4.17
435 | 20 | 6.05 | 4.78
870 | 21 | 7.49 | 4.25
|
Comparison of model error rates
|
|
Recognition results of the models under the optimal K value. Panels (a) to (c) show the results for 250, 435, and 870 training samples under the traditional KNN model; panels (d) to (f) show the results for 250, 435, and 870 training samples under the feature-weighted KNN model.
|