Mitigating heavy tails in error distributions of MLIPs by optimizing training strategies

Young-Jae Choi; Lucas Wagner

Oral-In-person

Abstract

Machine-learning interatomic potentials (MLIPs) have become ubiquitous in materials simulations owing to their excellent scalability with system size and ability to achieve chemical accuracy. Despite these successes, instabilities in molecular dynamics (MD) simulations have been reported, including unphysical abrupt melting [1], dissociation [2], or segregation [3]. These failures are triggered by large errors residing in the heavy tails of error distributions [4].

To identify conditions that mitigate these heavy tails, we trained ensembles of NequIP [5] MLIPs for silicon at 300 K, systematically varying model complexity, training dataset size, and the breadth of potential energy surface (PES) sampling used for training. The joint distributions of estimated errors (computed from the ensemble variance) and actual prediction errors on the test set both point to the optimal training conditions and clarify the origin of the heavy-tailed error distributions. Models with insufficient complexity underfit, showing weak correlations between estimated and actual errors, which indicates that they were overconfident in their predictions regardless of actual accuracy. In contrast, optimally fitted models displayed strong correlations between estimated and actual errors. Models with excessive complexity and small training datasets often overfit, producing error distributions with extremely heavy tails. Furthermore, sampling training data from a broader region of the PES (training at 1200 K, testing at 300 K) halved the extent of the heavy tails relative to narrow sampling (training and testing at 300 K), even at optimal model complexity and dataset size.
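The comparison of estimated and actual errors described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of the ensemble standard deviation (the square root of the ensemble variance, so that both errors share units), the Pearson coefficient as the correlation measure, and the quantile-to-median ratio as a tail measure are all assumptions made for illustration, and the data are synthetic rather than NequIP predictions.

```python
import numpy as np


def ensemble_error_stats(predictions, reference):
    """Return (estimated_error, actual_error) per test sample.

    predictions: array of shape (n_models, n_samples), one row per ensemble member.
    reference:   array of shape (n_samples,), e.g. reference (DFT) values.
    Assumption: estimated error = ensemble standard deviation,
    actual error = |ensemble mean - reference|.
    """
    predictions = np.asarray(predictions, dtype=float)
    reference = np.asarray(reference, dtype=float)
    mean_prediction = predictions.mean(axis=0)
    estimated_error = predictions.std(axis=0, ddof=1)  # unbiased sample std
    actual_error = np.abs(mean_prediction - reference)
    return estimated_error, actual_error


def pearson_correlation(x, y):
    """Pearson correlation between two 1D arrays (hypothetical choice of metric)."""
    return float(np.corrcoef(x, y)[0, 1])


def tail_ratio(errors, upper_quantile=0.99):
    """Ratio of a high quantile to the median; larger values mean heavier tails.

    This is one illustrative tail measure, not the one used in the abstract.
    """
    errors = np.asarray(errors, dtype=float)
    return float(np.quantile(errors, upper_quantile) / np.median(errors))


# Synthetic demonstration: each sample has its own noise scale, so samples
# that are hard to predict show both a large ensemble spread and a large error.
rng = np.random.default_rng(0)
n_models, n_samples = 8, 5000
noise_scale = rng.lognormal(mean=0.0, sigma=1.0, size=n_samples)
reference = rng.normal(size=n_samples)
predictions = reference + noise_scale * rng.normal(size=(n_models, n_samples))

estimated, actual = ensemble_error_stats(predictions, reference)
print("correlation(estimated, actual):", pearson_correlation(estimated, actual))
print("tail ratio of actual errors:   ", tail_ratio(actual))
```

Under these assumptions, a weak correlation would correspond to the underfitting regime described in the abstract, and a large tail ratio to a heavy-tailed error distribution; a rank correlation or a kurtosis-based measure could be substituted without changing the structure of the analysis.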

March 16, 2026, 10:24 AM – March 16, 2026, 10:36 AM

Publication: [1] Y. Liu, X. He, and Y. Mo, Discrepancies and error evaluation metrics for machine learning interatomic potentials, npj Computational Materials 9, 174 (2023).
[2] J. George, G. Hautier, A. P. Bartók, G. Csányi, and V. L. Deringer, Combining phonon accuracy with high transferability in Gaussian approximation potential models, The Journal of Chemical Physics 153 (2020).
[3] Y.-J. Choi and S.-H. Jhi, Efficient training of machine learning potentials by a randomized atomic-system generator, The Journal of Physical Chemistry B 124, 8704 (2020).
[4] P. Pernot, B. Huang, and A. Savin, Impact of non-normal error distributions on the benchmarking and ranking of quantum machine learning models, Machine Learning: Science and Technology 1, 035011 (2020).
[5] S. Batzner, A. Musaelian, L. Sun, M. Geiger, J. P. Mailoa, M. Kornbluth, N. Molinari, T. E. Smidt, and B. Kozinsky, E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials, Nature Communications 13, 2453 (2022).

Presenters

Young-Jae Choi
- University of Illinois at Urbana-Champaign

Authors

Young-Jae Choi
- University of Illinois at Urbana-Champaign
Lucas Wagner
- University of Illinois at Urbana-Champaign