Predicting literacy intervention responsiveness using semi-supervised machine learning
Amanda Swee-Ching Tan, Farhan Ali, Chiew Lim Lee, Kenneth K. Poon,
Predicting literacy intervention responsiveness using semi-supervised machine learning,
Research in Developmental Disabilities,
Volume 165,
2025,
105090,
ISSN 0891-4222,
https://doi.org/10.1016/j.ridd.2025.105090.
(https://www.sciencedirect.com/science/article/pii/S089142222500174X)
Abstract:
Background
Non-responsiveness to systematic phonics interventions is pervasive, and these interventions have tended to focus on near-transfer outcomes related to phonology. There is a need to predict intervention responsiveness for far-transfer outcomes such as literacy-relevant word reading and spelling. Advanced machine learning could also help maximize predictive power.
Aims
This study aims to longitudinally predict responsiveness to systematic phonics intervention using machine learning models.
Method
The sample included children with special educational needs (M = 98.08 months, N = 838) who either received long-term intervention (average duration of 33.62 months; labeled data) or had only baseline data without intervention (unlabeled data). We trained 12 semi-supervised learning models on the mix of labeled and unlabeled data to predict intervention responsiveness for word reading and spelling. Predictors were background information, domain-general cognitive abilities, and language-related achievement scores, with expanded predictors consisting of differences among these predictors.
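The Method above combines three ideas: predictors expanded with their pairwise differences, a classifier trained on labeled cases, and unlabeled cases folded into training. The sketch below illustrates one way these pieces can fit together. It is not the authors' pipeline: the abstract does not name the semi-supervised strategy, so self-training (pseudo-labeling confident unlabeled cases) is swapped in as an assumption. All names (`expand_with_differences`, `TinyGaussianNB`, `self_train`, `make_children`), the 0.9 confidence threshold, and the data are hypothetical; the data are synthetic and unrelated to the study sample.

```python
import math
import random
from itertools import combinations


def expand_with_differences(scores):
    """Append every pairwise difference between predictors to the row."""
    return list(scores) + [a - b for a, b in combinations(scores, 2)]


class TinyGaussianNB:
    """Hypothetical minimal Gaussian Naive Bayes for 0/1 labels."""

    def fit(self, X, y):
        self.stats = {}
        for label in sorted(set(y)):
            rows = [x for x, t in zip(X, y) if t == label]
            cols = list(zip(*rows))
            means = [sum(c) / len(c) for c in cols]
            # A small variance floor avoids division by zero.
            variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-6
                         for c, m in zip(cols, means)]
            self.stats[label] = (math.log(len(rows) / len(X)), means, variances)
        return self

    def predict_proba(self, x):
        log_post = {}
        for label, (log_prior, means, variances) in self.stats.items():
            log_post[label] = log_prior + sum(
                -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
                for v, m, s2 in zip(x, means, variances))
        top = max(log_post.values())  # subtract the max for numerical stability
        weights = {k: math.exp(lp - top) for k, lp in log_post.items()}
        total = sum(weights.values())
        return {k: w / total for k, w in weights.items()}

    def predict(self, x):
        proba = self.predict_proba(x)
        return max(proba, key=proba.get)


def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=5):
    """Self-training: pseudo-label confident unlabeled rows, then refit."""
    X_train, y_train, pool = list(X_lab), list(y_lab), list(X_unlab)
    model = TinyGaussianNB().fit(X_train, y_train)
    for _ in range(max_rounds):
        remaining, added = [], 0
        for x in pool:
            proba = model.predict_proba(x)
            label = max(proba, key=proba.get)
            if proba[label] >= threshold:
                X_train.append(x)
                y_train.append(label)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:
            break
        model = TinyGaussianNB().fit(X_train, y_train)
    return model


def f1_score(y_true, y_pred):
    """F1 for the positive class (1): 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def make_children(n, rng):
    """Synthetic (expanded predictors, responder) pairs; NOT study data."""
    data = []
    for _ in range(n):
        responder = int(rng.random() < 0.5)
        scores = [rng.gauss(float(responder), 1.0) for _ in range(3)]
        data.append((expand_with_differences(scores), responder))
    return data


rng = random.Random(0)
labeled = make_children(60, rng)
unlabeled = [x for x, _ in make_children(120, rng)]  # labels discarded
held_out = make_children(40, rng)

model = self_train([x for x, _ in labeled], [y for _, y in labeled], unlabeled)
test_X = [x for x, _ in held_out]
test_y = [y for _, y in held_out]
f1 = f1_score(test_y, [model.predict(x) for x in test_X])
print(f"F1 on synthetic test set: {f1:.2f}")
```

Raising `threshold` admits fewer pseudo-labels and so leans less on the unlabeled pool. Pairwise differences are linear combinations of the original predictors, so they violate the independence assumption of Naive Bayes; the sketch accepts that for simplicity.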
Results
Among the 12 models developed, Random Forest and Gaussian Naïve Bayes achieved the highest F1 score (0.7) on the test set, supported by the incorporation of unlabeled data and expanded predictors. The top predictors were related to verbal comprehension, visual memory, and verbal working memory.
Conclusions
We identified important predictors of intervention responsiveness and showed the promise of machine learning models, with implications for the allocation of resources, mitigation of the risk of failure, and tailoring of interventions.
Keywords: Children; Dyslexia; Literacy; Intervention; Semi-supervised machine learning