Metrics for Evaluating Stochastic Outputs in Machine Learning Models: Addressing Accuracy and Uncertainty

2025-12-25

Yushi Deng, Mario R. Eden, Selen Cremaschi,
Metrics for Evaluating Stochastic Outputs in Machine Learning Models: Addressing Accuracy and Uncertainty,
Computers & Chemical Engineering,
2025,
109474,
ISSN 0098-1354,
https://doi.org/10.1016/j.compchemeng.2025.109474.
(https://www.sciencedirect.com/science/article/pii/S0098135425004776)
Abstract: Chemical processes are modeled and designed considering uncertainties in system parameters, operating conditions, and environmental factors. Models with stochastic outputs are commonly employed to make decisions in chemical engineering designs. Evaluating models with stochastic outputs requires assessing both prediction accuracy and precision. The area metric measures the overall mismatch between the prediction and observation, but does not assess precision and accuracy separately. Previously, we introduced uncertainty width, which decomposes the area metric into precision and bias components for model outputs whose distributions are symmetric. In this work, we investigate the applicability and effectiveness of the uncertainty width for asymmetric output distributions through a series of computational experiments. We then apply the uncertainty width to evaluate the performance of eight distinct machine learning techniques, each integrated within a hybrid modeling framework, for predicting liquid entrainment and its uncertainty across three different flow orientations. The results of the computational experiments suggest that the uncertainty width is effective for asymmetric cases, especially when bias is small. Further studies are needed to understand the effectiveness of uncertainty width in cases of large bias and extreme asymmetry. The results for the hybrid models support the effectiveness of uncertainty width. They reveal that the Gaussian Process model has the best overall prediction accuracy. The remaining models exhibit diverse trade-offs between precision and accuracy, indicating that model selection should be guided by the specific accuracy and precision requirements of the application.
Keywords: Metrics; Machine Learning; Area Metric; Accuracy; Precision; Uncertainty
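
Illustrative sketch (not from the paper): the abstract describes the area metric as a measure of the overall mismatch between prediction and observation. A minimal sketch follows, assuming the commonly used definition of the area metric as the area between the predicted and observed empirical cumulative distribution functions. The function name area_metric and the sample-based formulation are assumptions made for illustration. The uncertainty width decomposition is not sketched, because the abstract does not define it.

```python
import numpy as np


def area_metric(pred_samples, obs_samples):
    """Area between the empirical CDFs of two 1-D samples.

    Assumption: the area metric is the integral of |F_pred(x) - F_obs(x)| dx,
    with both CDFs estimated from samples (hypothetical helper, not the
    authors' implementation).
    """
    pred = np.sort(np.asarray(pred_samples, dtype=float))
    obs = np.sort(np.asarray(obs_samples, dtype=float))

    # Both empirical CDFs are step functions that only change at sample
    # points, so the integral is exact on the merged grid of all samples.
    grid = np.sort(np.concatenate([pred, obs]))
    widths = np.diff(grid)

    # Value of each empirical CDF on the interval [grid[i], grid[i+1]).
    f_pred = np.searchsorted(pred, grid[:-1], side="right") / pred.size
    f_obs = np.searchsorted(obs, grid[:-1], side="right") / obs.size

    return float(np.sum(np.abs(f_pred - f_obs) * widths))


if __name__ == "__main__":
    # Identical samples: the CDFs coincide, so the area is zero.
    print(area_metric([0.0, 1.0], [0.0, 1.0]))
    # A pure shift by one unit gives an area of one unit.
    print(area_metric([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))
```

Under this definition a single number mixes the effect of a shifted prediction (bias) with the effect of a prediction that is too wide or too narrow (precision), which is the limitation the abstract cites as motivation for the uncertainty width.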