Identification of key predictors of acute GVHD in pediatric acute Leukemia using machine learning methods

2025-11-06

İlknur Buçan Kırkbir, Hacer Kobya Bulut,
Identification of key predictors of acute GVHD in pediatric acute Leukemia using machine learning methods,
Transplant Immunology,
Volume 93,
2025,
102318,
ISSN 0966-3274,
https://doi.org/10.1016/j.trim.2025.102318.
(https://www.sciencedirect.com/science/article/pii/S0966327425001467)
Abstract: Background
Hematopoietic stem cell transplantation (HSCT) is a crucial treatment for leukemia. Allogeneic hematopoietic cell transplantation (HCT), in which stem cells from a healthy donor are used, carries significant risks, including graft-versus-host disease (GVHD), a severe complication that leads to high morbidity and mortality. This study aimed to identify significant predictors of acute GVHD (aGVHD) in pediatric patients with acute lymphoblastic leukemia (ALL) and acute myeloid leukemia (AML).
Methods
This retrospective study analyzed predictors of aGVHD in pediatric patients using three machine learning methods: Random Forest (RF), Logistic Regression (LR), and Boruta. The dataset was obtained from the open-access UCI Machine Learning Repository. After the class distribution of the dependent variable (aGVHD) was balanced using the Synthetic Minority Oversampling Technique (SMOTE), data from a total of 124 pediatric patients, comprising demographic and clinical variables, were analyzed in the R open-source programming language.
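The class-balancing step can be illustrated with a minimal sketch of the basic SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. This is not the authors' code (the study used R, and the package and settings are not stated here). The function name `smote`, the toy feature values, and the class sizes are hypothetical, and the sketch handles numeric features only.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic new points interpolated between minority samples.

    Simplified sketch of SMOTE for numeric features; `minority` is a list of
    equal-length lists of floats. Not the implementation used in the study.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of `base` (Euclidean distance).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data: [recipient age, recipient weight]; values are invented.
minority = [[4.0, 18.0], [6.0, 22.0], [5.0, 20.0], [7.5, 27.0]]
majority_count = 10  # hypothetical size of the larger class

new_points = smote(minority, majority_count - len(minority), k=2)
balanced_minority = minority + new_points
print(len(balanced_minority))  # prints 10
```

Because every synthetic point is a convex combination of two real minority samples, it always lies within the range already spanned by the minority class. Categorical predictors such as blood group or CMV status would need a SMOTE variant designed for mixed data, which this sketch does not cover.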
Results
Significant differences were observed between the aGVHD and non-aGVHD groups in recipient age (p = 0.002), recipient weight (p = 0.02), donor age (p = 0.01), allele mismatch (p = 0.01), antigen mismatch (p = 0.01), and recipient CMV status (p = 0.04). Feature importance analyses using Random Forest (RF) and Boruta identified recipient age, recipient weight, and donor age as the most influential predictors of aGVHD. Logistic Regression (LR) highlighted recipient Rh-positive status and donor blood group B as additional relevant factors, offering a complementary perspective. These findings may assist in risk assessment and the development of preventive strategies for aGVHD.
Conclusion
Machine learning methods effectively identified important predictors of aGVHD, demonstrating their potential to improve post-HCT care and preventive protocols. Our results underscore the complex etiology of aGVHD and suggest that multiple factors are involved in its development. Further studies should focus on integrating these findings into clinical practice to improve patient outcomes.
Keywords: Acute graft-versus-host disease; Hematopoietic stem cell transplantation; Machine learning