Evaluation of Machine Learning Algorithms For The Classification of Lithology Using Geophysical Logs
Summary
Classification of subsurface formation lithology from well log data is a significant task in geoscience,
petroleum exploration and engineering. Presently, several machine learning algorithms have been
implemented for lithology classification to improve prediction accuracy. However, because of
complex geological conditions, such algorithms have rarely been applied to mineral deposits. In this paper,
we evaluate three popular machine learning algorithms: the Support Vector Machine (SVM), Random
Forest and Gradient Boosting Decision Tree (GBDT). Grid search and 10-fold
cross-validation were used to optimize the hyperparameters of each model and avoid overfitting. The performance
of each model is evaluated using accuracy, precision, recall and F1-score, comparing the predicted
lithology labels against the true labels. The results show that the Gradient Boosting Decision Tree model
has the best lithology classification performance of the three models, with a precision of 97.74%, recall of 98.67% and
F1-score of 98.20%. Interpretation of the GBDT model shows that the order of
features contributing to the lithology classification is Vp > density > Vs > natural gamma. The
study suggests that the GBDT model can provide significant information for further exploration targeting of
deep mineral deposits.
Introduction
Classification of subsurface formation lithological units was traditionally conducted using methods
such as description and analysis of recovered core samples, and examination of cuttings retrieved during
drilling operations. These traditional methods are the most direct and practical. However, they
are not always suitable because coring is costly and core recovery is
sometimes incomplete. In addition, because of complex geological conditions, different geologists may
give different interpretations, leading to uncertainty (Benaouda et al. 1999; Xie et al. 2018). Downhole
geophysical logging has been adopted to address these challenges. Downhole geophysical log data
provides high vertical resolution and good continuity of in-situ information. In addition, log
responses from the parts of the borehole with core recovery can be used
to interpret the parts of the borehole with core loss (Benaouda et al. 1999; Xie et al. 2018). Therefore,
geophysical logs are a significant source of subsurface rock information. However, the relationship
between the geophysical logging signatures and formation lithology is often complex.
Recently, several machine learning algorithms have been used to classify lithology using geophysical
log data. These algorithms help geoscientists handle the non-linear relationship between
geophysical logging signatures and subsurface lithologies and thereby improve classification
accuracy (Xie et al. 2018). The aim of this research is to evaluate machine learning algorithms for the
classification of formation lithological units using geophysical logs from a gold deposit, and to compare
the performance of three popular optimized machine learning algorithms on the basis of
computational training time and classification accuracy.
The study site is at Moab Khotsong gold mine in the Klerksdorp gold field region on the northwest
border of the Witwatersrand Basin (Figure 1). The Witwatersrand Basin is unique and of great
importance to geoscientists and explorationists. It contains the oldest well-preserved, laterally extensive
sedimentary succession in the world and hosts rich gold-bearing conglomerate beds (Frimmel
and Minter, 2002). The geological structure of Moab Khotsong gold mine is complex, with a series of
faults and intrusives cutting across igneous and metasedimentary rocks. Presently, mining at Moab
Khotsong is approaching a depth of about 4 km. Boreholes were drilled under the auspices of the
International Continental Scientific Drilling Program project DSeis to investigate the nature of the fault
zone that hosted a M5.5 earthquake in August 2014 (Ogasawara et al. 2017).
Figure 1 Geological map of the Witwatersrand Basin, South Africa. The location of Moab Khotsong
gold mine is denoted by the blue circle on the enlarged map (modified after Dankert and Hein, 2010).
Hyperparameter tuning
Hyperparameter tuning is a selection process that uses a performance metric to identify appropriate
search ranges and the parameter values that give the best accuracy, for optimal training of
each classifier model. This was achieved using a grid search method. In addition, validation score curves
from 10-fold cross-validation were used to avoid overfitting or underfitting of each model for some
hyperparameter values.
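The tuning procedure described above can be sketched as follows. This is a minimal illustration using scikit-learn, which the text does not name as the toolkit used in the study. The synthetic data, the scaling step for the SVM and the deliberately small search grids are all assumptions made for the example and are not the study's data or settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for four log features (e.g. Vp, Vs, density, natural gamma)
# and three lithology classes. This is NOT the Moab Khotsong data.
X, y = make_classification(
    n_samples=200, n_features=4, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Hypothetical search grids, kept small so the example runs quickly.
candidates = {
    "SVM": (
        make_pipeline(StandardScaler(), SVC()),
        {"svc__C": [1, 10], "svc__gamma": ["scale", 0.1]},
    ),
    "RF": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [25, 50], "max_depth": [None, 5]},
    ),
    "GBDT": (
        GradientBoostingClassifier(random_state=0),
        {"n_estimators": [25, 50], "learning_rate": [0.05, 0.1]},
    ),
}

# 10-fold stratified cross-validation scores every grid point for each model.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
best = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=cv, scoring="accuracy")
    search.fit(X, y)
    best[name] = (search.best_params_, search.best_score_)
    print(name, search.best_params_, round(search.best_score_, 3))
```

Each entry of `best` holds the selected hyperparameters and the mean cross-validated accuracy for one model. In practice a separate held-out test set would still be needed for the final comparison.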
A summarized workflow of the proposed methods is shown in Figure 2 and the results in Figure 3.
1) Overall, the results show that our supervised models have similar classification accuracies, all
above 90%. Also, the proportion of each lithology classified into the correct label is above 80%
(Figure 3a). However, the results show that the GBDT model has the best lithology classification
performance, with an accuracy of 97.06%; a precision of 97.04%; a recall of 97.06%;
and an F1-score of 97.04%, compared to the other optimized models;
2) The results demonstrate that quartzite and intrusive rocks were more accurately classified (Figure
3b). This could be due to the homogeneity of these rocks, their large sample sizes, textures, and structures.
On the other hand, siltstone had the highest misclassification rate, and was often misclassified as
quartzite. This could be because the Vp and density values of siltstone and quartzite are similar,
because siltstone is heterogeneous, with frequent interbedding of shale and quartz, and because the sample
size of siltstone was too small for the relationship between logging and lithology to be fitted;
3) The K-means clustering results improve knowledge of the location of significant amounts of
radioactive elements (Figure 3b). In addition, quartzite and intrusive rocks were more accurately
predicted without lithological information from core analysis and observation.
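The four scores quoted in the results can be computed from a confusion matrix, as in the sketch below. The text does not state how the per-class scores were averaged. Support-weighted averaging is assumed here because it makes recall equal to accuracy, which matches the reported pair of 97.06% values. The confusion matrix is hypothetical and the function name is illustrative.

```python
def classification_metrics(confusion):
    """Accuracy and support-weighted precision, recall and F1.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n_classes))
    precision = recall = f1 = 0.0
    for k in range(n_classes):
        tp = confusion[k][k]
        support = sum(confusion[k])                   # samples whose true class is k
        predicted = sum(row[k] for row in confusion)  # samples predicted as class k
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total                      # weight each class by its support
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {
        "accuracy": correct / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
toy = [
    [50, 2, 0],
    [3, 40, 1],
    [0, 4, 10],
]
scores = classification_metrics(toy)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

With this averaging, a small class such as siltstone contributes little to the overall scores, so per-class values remain worth inspecting alongside the aggregate ones.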
Conclusions
This research shows that machine learning methods can effectively classify lithologies in mineral
deposits using geophysical log data, with or without lithology labels. In addition,
the results demonstrate that the GBDT model outperformed the other two competing models in
classification accuracy for the test set, while the SVM model had the fastest computational training time.
Finally, according to the interpretation of the GBDT model, the order of importance of the logging features that
determine lithology at Moab Khotsong gold mine is Vp > density > Vs > natural gamma.
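A feature ranking of this kind can be obtained from a fitted gradient boosting model, as in the sketch below. The text does not say which interpretation method was used. The example assumes scikit-learn's impurity-based `feature_importances_`, and it uses synthetic data in which the label is arbitrarily made to depend mostly on the first column, so the printed order illustrates the mechanics only and does not reproduce the study's result.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
feature_names = ["Vp", "Vs", "density", "natural gamma"]

# Synthetic stand-in data (NOT the study's logs): by construction the label
# depends strongly on column 0 and weakly on column 2.
X = rng.normal(size=(400, 4))
y = (X[:, 0] + 0.3 * X[:, 2] > 0).astype(int)

model = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)

# Impurity-based importances are normalised to sum to 1.
importances = model.feature_importances_
ranking = [feature_names[i] for i in np.argsort(importances)[::-1]]
print(" > ".join(ranking))
```

Impurity-based importances can be biased when features are correlated, as Vp, Vs and density often are, so a permutation-based check may be a useful complement.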
References
Benaouda, D., Wadge, G., Whitmarsh, R.B., Rothwell, R.G. and MacLeod, C. [1999] Inferring the
lithology of borehole rocks by applying neural network classifiers to downhole logs: an example from
the Ocean Drilling Program. Geophysical Journal International, 136(2), 477-491.
Dankert, B. T. and Hein, K. A. A. [2010] Evaluating the structural character and tectonic history of
the Witwatersrand Basin. Precambrian Research, 177(1-2), 1-22.
Friedman, J. H. [2002] Stochastic gradient boosting. Computational Statistics & Data Analysis, 38(4),
367-378.
Frimmel, H. E. and Minter, W. E. L. [2002] Recent developments concerning the geological history
and genesis of the Witwatersrand gold deposits, South Africa. Society of Economic Geologists,
Special Publication, 9, 45-117.
Ogasawara, H., Durrheim, R. J., Yabe, Y., Ito, T., van Aswegen, G., Grobbelaar, M., Funato, A.,
Ishida, A., Mngadi, S., Manzi, M. S. D. and Ziegler, M. [2017] Drilling into seismogenic zones of
M2.0–M5.5 earthquakes from deep South African gold mines (DSeis): establishment of research
sites. In Proceedings of ISRM AfriRock 2017, 3-5 October 2017, Cape Town, South African Institute
of Mining & Metallurgy, Symposium Series S93, 237-248.
Xie, Y., Zhu, C., Zhou, W., Li, Z., Liu, X. and Tu, M. [2018] Evaluation of machine learning methods
for formation lithology identification: A comparison of tuning processes and model performances.
Journal of Petroleum Science and Engineering, 160, 182-193.