Real-time and explainable rock mass classification under imbalanced tunnel boring machine data using hybrid resampling and ensemble learning
Rui Li, Junlong Yan, Yueji He, Shaoxuan Guo, Qingsong Zhang, Rentai Liu, Yanyi Liu, Xuanyue Feng, Xin Chen,
Real-time and explainable rock mass classification under imbalanced tunnel boring machine data using hybrid resampling and ensemble learning,
Engineering Applications of Artificial Intelligence,
Volume 162, Part D,
2025,
112641,
ISSN 0952-1976,
https://doi.org/10.1016/j.engappai.2025.112641.
(https://www.sciencedirect.com/science/article/pii/S0952197625026727)
Abstract: The construction safety and efficiency of Tunnel Boring Machines (TBMs) depend heavily on accurate identification of surrounding rock mass grades. This study develops a data-driven rock mass prediction model using tunneling parameters collected from the Yinchao Jiliao diversion project. The mutual information coefficient, Spearman correlation analysis, and kernel density estimation were applied jointly to identify the statistical features, derived from key tunneling parameters, that are most relevant to the surrounding rock classes. Seven individual models and three ensemble learning models were established, with hyperparameters optimized via a Tree-structured Parzen Estimator (TPE) based Bayesian algorithm and stratified five-fold cross-validation. To address the core challenges of highly imbalanced sample distribution and inter-class feature overlap, this study introduced the Synthetic Minority Over-sampling Technique (SMOTE) and SMOTE-Tomek for data preprocessing. Considering the asymmetric risk associated with misclassifying different rock mass grades in practical tunneling engineering, a risk-preference metric termed High-Risk Average Recall (HRAR) was proposed to evaluate the models, prioritizing the prevention of misclassifying high-risk rock masses (Class IV and V) as low-risk rock masses (Class II and III). Based on comprehensive metrics, the SMOTE-Tomek-preprocessed Soft-Voting ensemble model achieved superior macro-average performance and a high HRAR value. To enhance model transparency and credibility, SHapley Additive exPlanations (SHAP) was employed for explainability analysis. This method elucidated the contribution and influence of key features (thrust, torque, advance rate) on rock mass classification across different models. This study provides a systematic solution and technical foundation for geological perception and risk early-warning in intelligent TBM tunneling.
Keywords: Tunnel boring machine; Rock mass classification; Ensemble machine learning; Bayesian optimization; Imbalanced data; Shapley additive explanations
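Illustrative note on HRAR: the abstract names the High-Risk Average Recall metric but does not give its formula. The sketch below shows one plausible reading, assuming HRAR is the unweighted mean of the per-class recalls of the high-risk grades (Class IV and V). The function names, the label encoding, and the toy labels are hypothetical and are not taken from the paper; the authors' exact definition may differ, for example by counting only high-risk samples predicted as Class II or III as misses.

```python
from typing import Optional, Sequence

# Assumption: Class IV and V are the high-risk grades, as stated in the abstract.
HIGH_RISK = ("IV", "V")


def class_recall(y_true: Sequence[str], y_pred: Sequence[str], label: str) -> Optional[float]:
    """Recall for one class: correctly predicted samples / true samples of that class."""
    total = sum(1 for t in y_true if t == label)
    if total == 0:
        return None  # recall is undefined when the class is absent
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    return hits / total


def high_risk_average_recall(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    high_risk: Sequence[str] = HIGH_RISK,
) -> float:
    """Assumed HRAR: unweighted mean recall over the high-risk classes present in y_true."""
    recalls = [class_recall(y_true, y_pred, c) for c in high_risk]
    defined = [r for r in recalls if r is not None]
    if not defined:
        raise ValueError("no high-risk samples in y_true")
    return sum(defined) / len(defined)


# Toy labels, invented purely for illustration (not data from the study).
y_true = ["II", "III", "III", "IV", "IV", "IV", "IV", "V", "V"]
y_pred = ["II", "III", "IV", "IV", "IV", "III", "IV", "V", "IV"]

hrar = high_risk_average_recall(y_true, y_pred)
print(f"HRAR = {hrar:.3f}")
```

In this toy case, three of four Class IV samples and one of two Class V samples are recovered, so the assumed HRAR is the mean of 0.75 and 0.5. Averaging per-class recalls rather than pooling the samples keeps the rarer Class V from being dominated by Class IV, which matches the abstract's concern with imbalanced data.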