Revisiting China’s energy poverty: Machine learning insights into key predictors and nonlinear relationships
Zhiqun Li, Hanol Lee,
Revisiting China’s energy poverty: Machine learning insights into key predictors and nonlinear relationships,
Sustainable Futures,
Volume 10,
2025,
101225,
ISSN 2666-1888,
https://doi.org/10.1016/j.sftr.2025.101225.
(https://www.sciencedirect.com/science/article/pii/S2666188825007877)
Abstract: This study examines the determinants of multidimensional energy poverty (EP) in China from 2012 to 2020 using an integrated approach that combines econometric analysis with machine-learning techniques. Using data from the China Family Panel Survey, this study constructs a Multidimensional Energy Poverty Index that captures both accessibility and affordability. Traditional regression analysis identifies key predictors, including income level, education, family size, and hukou status. To address the limitations of linear models, particularly their failure to detect nonlinear patterns and their poor handling of missing data, this study applies extreme gradient boosting and interprets the results using Shapley Additive Explanations. The machine-learning approach demonstrates superior predictive performance, with an R-squared of 0.482 compared with 0.202 for traditional regression. It also shows that household income, family size, education, and hukou status are the most important predictors of EP. Notably, household income emerges as the dominant factor, with a feature importance score of 0.125, accounting for approximately 47.5% of the combined importance of all predictors, underscoring its critical role in determining EP. Importantly, the results show complex nonlinear relationships that the linear regression model fails to fully capture, providing a deeper understanding of how these factors contribute to EP. These insights provide robust data-driven guidance for policymakers seeking to mitigate EP in China through income support, educational programs, and rural infrastructure development.
Keywords: Multidimensional energy poverty; Determinants of energy poverty; Explainable machine learning
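
Illustrative note: the abstract attributes predictions to individual factors using Shapley Additive Explanations. The idea can be sketched with exact Shapley values on a toy model. Everything in this sketch is an assumption made for illustration: the function `toy_ep_score`, its coefficients, the three features, and the baseline and household values are hypothetical and are not the study's model, data, or results. The study uses extreme gradient boosting; here a hand-written nonlinear function stands in for it so that the example needs only the Python standard library.

```python
from itertools import permutations

# Hypothetical feature names, chosen to echo the predictors named in the abstract.
FEATURES = ("income", "family_size", "education")


def toy_ep_score(x):
    """Hypothetical nonlinear energy-poverty score (higher = more deprived)."""
    income_effect = 1.0 / (1.0 + x["income"])  # diminishing effect of income
    size_effect = 0.05 * x["family_size"]
    edu_effect = -0.02 * x["education"]
    # Interaction term: family size matters more when income is low.
    interaction = 0.01 * x["family_size"] / (1.0 + x["income"])
    return income_effect + size_effect + edu_effect + interaction


def shapley_values(model, instance, baseline):
    """Exact Shapley values relative to a single baseline point.

    For every ordering of the features, switch each feature from its
    baseline value to the instance value and record the change in the
    model output. Averaging over all orderings gives the Shapley value.
    """
    contrib = {name: 0.0 for name in FEATURES}
    orderings = list(permutations(FEATURES))
    for order in orderings:
        current = dict(baseline)
        previous_output = model(current)
        for name in order:
            current[name] = instance[name]
            new_output = model(current)
            contrib[name] += new_output - previous_output
            previous_output = new_output
    return {name: total / len(orderings) for name, total in contrib.items()}


# Hypothetical reference household and household to explain (arbitrary units).
baseline = {"income": 4.0, "family_size": 3, "education": 9}
household = {"income": 1.0, "family_size": 5, "education": 6}

phi = shapley_values(toy_ep_score, household, baseline)

# Share of each feature in the total absolute attribution.
total_abs = sum(abs(v) for v in phi.values())
shares = {name: abs(v) / total_abs for name, v in phi.items()}

for name in FEATURES:
    print(f"{name}: shapley={phi[name]:+.4f}, share={shares[name]:.1%}")
```

The attributions sum to the difference between the household's score and the baseline score, which is the additivity property that makes this kind of decomposition interpretable. Normalizing absolute attributions into shares, as in the last step, is one common way to express a predictor's relative importance; whether the 47.5% figure in the abstract was computed this way is not stated there.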