Beyond the Benchmark: Bridging MLIP Performance and Practical Materials Discovery

ORAL · Invited

Abstract

For decades, we have imagined using computation to design materials rather than merely explain them. With today’s rapid progress in density functional theory (DFT) and machine-learned interatomic potentials (MLIPs), this dream seems closer than ever, but what are the potential pitfalls? In this talk, I will explore this question through concrete data from our recent Matbench-Discovery benchmark, which systematically evaluates how well current atomistic ML models can identify new, stable materials across chemical space. I will discuss the difficulties of using regression metrics for classification-like discovery tasks and the challenges of testing true out-of-distribution performance in models trained on very extensive datasets.
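The mismatch between regression metrics and classification-like discovery tasks can be illustrated with a minimal sketch. It assumes that "stable" means an energy above the convex hull of at most 0 eV/atom. All numbers and the names `mae`, `stability_f1`, `model_a`, and `model_b` are hypothetical and chosen for illustration only; they are not data from the benchmark.

```python
def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def stability_f1(y_true, y_pred, threshold=0.0):
    """F1 score for the 'stable' class, where stable means value <= threshold.

    The threshold of 0 eV/atom above the hull is an assumption of this sketch.
    """
    true_stable = [t <= threshold for t in y_true]
    pred_stable = [p <= threshold for p in y_pred]
    tp = sum(t and p for t, p in zip(true_stable, pred_stable))
    fp = sum((not t) and p for t, p in zip(true_stable, pred_stable))
    fn = sum(t and (not p) for t, p in zip(true_stable, pred_stable))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical energies above hull in eV/atom (invented toy values).
e_true = [-0.02, -0.01, 0.01, 0.02, 0.30, 0.40]

# Model A: small errors, but each near-hull prediction lands on the wrong
# side of the stability threshold.
model_a = [0.01, 0.02, -0.01, -0.02, 0.30, 0.40]

# Model B: larger errors, but every prediction keeps the correct side.
model_b = [-0.10, -0.09, 0.09, 0.10, 0.38, 0.48]

for name, pred in [("A", model_a), ("B", model_b)]:
    print(f"model {name}: MAE={mae(e_true, pred):.3f} eV/atom, "
          f"F1={stability_f1(e_true, pred):.2f}")
```

In this toy setup, model A has the lower mean absolute error yet misclassifies every material near the hull, while model B has the higher error and classifies all of them correctly. Ranking the two models by regression error alone would therefore pick the one that is less useful for screening.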

As performance on standard benchmarks begins to saturate, it becomes essential to ask how further improvements translate to real progress in the materials design pipeline. In particular, I will outline how MLIPs have expanded our predictive reach yet remain constrained by fundamental questions: what exactly should we compute, and when does improved accuracy stop yielding better design outcomes without first expanding the underlying theoretical framework? Finally, I will highlight emerging directions: autonomous discovery loops that combine MLIPs, large-scale sampling, and foundation models that aim to move beyond property prediction toward reasoning about how to use computational tools to solve practical materials science problems.

Presenters

  • Anubhav Jain

    • Lawrence Berkeley National Laboratory
    • LBNL

Authors

  • Anubhav Jain

    • Lawrence Berkeley National Laboratory
    • LBNL