Alexandria 2.0: AI-Driven Discovery and Open Data Infrastructure for Materials Design
ORAL
Abstract
The Alexandria 2.0 database is a next-generation platform for AI-assisted materials discovery that integrates generative models, graph neural networks, and universal machine-learning interatomic potentials into a unified open-science framework. In its latest expansion, Alexandria combines large-scale generative structure prediction with active-learning energy refinement, raising the yield of near-stable compounds (those within 100 meV/atom of the convex hull) from 36% to 99%. This effort produced 1.3 million new DFT-calculated materials, including over 74 000 predicted stable phases, expanding Alexandria to 5.8 million total structures. The resulting dataset, which comprises both equilibrium structures and 14 million out-of-equilibrium configurations, serves as a foundation for training universal machine-learning potentials and generative models. By openly releasing all data, models, and workflows under permissive licenses, Alexandria establishes a community-scale infrastructure for accelerating AI-driven materials design. Our analysis highlights emerging correlations between structural diversity, coordination topology, and stability, revealing patterns that can guide future generative models of matter.
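To make the "within 100 meV/atom of the convex hull" criterion concrete, the following minimal Python sketch computes the energy above the hull for a hypothetical binary A-B system and the resulting near-stable yield. It is an illustration under simplifying assumptions, not the Alexandria workflow: the function names (`lower_hull`, `energy_above_hull`, `near_stable_yield`) and the sample energies are invented for this example, compositions are reduced to a single fraction x of element B, and formation energies are taken relative to the elemental end members (so the hull is pinned to zero at x = 0 and x = 1).

```python
from bisect import bisect_right


def _cross(o, a, b):
    """z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(points):
    """Lower convex hull of (x, energy) points (Andrew's monotone chain)."""
    hull = []
    for p in sorted(set(points)):
        # Drop the last vertex while it lies on or above the line hull[-2] -> p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def energy_above_hull(entries):
    """Return the energy above the hull (eV/atom) for each (x, e_form) entry.

    x is the atomic fraction of element B (0 <= x <= 1) and e_form is the
    formation energy per atom in eV (assumed inputs for this sketch).
    """
    # Keep only the lowest energy per composition; pin the end members at 0 eV.
    best = {0.0: 0.0, 1.0: 0.0}
    for x, e in entries:
        best[x] = min(e, best.get(x, e))
    hull = lower_hull(best.items())
    xs = [x for x, _ in hull]

    result = []
    for x, e in entries:
        # Locate the hull segment containing x and interpolate linearly.
        i = min(bisect_right(xs, x), len(hull) - 1)
        (x0, e0), (x1, e1) = hull[i - 1], hull[i]
        e_hull = e0 + (e1 - e0) * (x - x0) / (x1 - x0)
        result.append(e - e_hull)
    return result


def near_stable_yield(e_above, threshold=0.100):
    """Fraction of entries within `threshold` eV/atom of the hull."""
    return sum(e <= threshold for e in e_above) / len(e_above)


# Made-up formation energies (eV/atom), for illustration only.
entries = [(0.50, -0.40), (0.25, -0.15), (0.75, -0.25), (0.50, -0.10)]
e_above = energy_above_hull(entries)
print([round(e, 3) for e in e_above])
print(near_stable_yield(e_above))
```

In this toy set, three of the four candidates fall within 100 meV/atom of the hull. Multicomponent systems need a hull in higher-dimensional composition space, for which established phase-diagram tools are the more appropriate choice.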
*Supported by the Horizon Europe MSCA Doctoral Network grant No. 101073486 (EUSpecLab), funded by the European Union; the Simons Foundation; the West Virginia Higher Education Policy Commission through the Research Challenge Grant Program (Award No. RCG 23-007, 2022); and the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program, project HERO, Grant Agreement No. 810451.
Publication: AI-Driven Expansion of the Alexandria Database, to be submitted.
Presenters
- Aldo H Romero, West Virginia University