Optimizing Feature Space for Small or Lower-Quality Data: A Case-Study in Charge Carrier Mobility

ORAL

Abstract

Artificial intelligence (AI) creates models that can accelerate the discovery of functional materials [1]. An open question is how to select the relevant materials features (descriptive parameters that characterize the material) that should be used to represent the material's function of interest, especially when good-quality data are scarce. Here we present a method that uses feature importance metrics, such as SHAP values [2, 3], to select an optimal set of input features for a given problem. We then use this procedure to train better models for electron mobility, using a dataset of 64 materials with experimentally determined electron mobilities and 23 computationally generated inputs. From these inputs we find a subset of four features that generates the best model across multiple regression techniques. The final set of models is then analyzed to find the regions of materials space where high electron mobilities are expected.
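The selection idea can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a plain linear model, for which the SHAP value of feature j for sample i reduces to w_j (x_ij - mean(x_j)) when features are treated as independent. It also assumes a simple backward-elimination loop scored by leave-one-out error. The data are synthetic, and the function names (fit_linear, mean_abs_linear_shap, loo_rmse, select_features) are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns (weights, intercept)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

def mean_abs_linear_shap(X, w):
    """Mean |SHAP| per feature for a linear model, assuming independent features."""
    return np.abs(w * (X - X.mean(axis=0))).mean(axis=0)

def loo_rmse(X, y):
    """Leave-one-out root-mean-square error of the linear model."""
    n = len(y)
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        w, b = fit_linear(X[mask], y[mask])
        errs.append(y[i] - (X[i] @ w + b))
    return float(np.sqrt(np.mean(np.square(errs))))

def select_features(X, y):
    """Drop the least important feature one at a time (by mean |SHAP|)
    and return the subset with the lowest leave-one-out RMSE."""
    active = list(range(X.shape[1]))
    best_subset, best_err = list(active), loo_rmse(X, y)
    while len(active) > 1:
        w, _ = fit_linear(X[:, active], y)
        importance = mean_abs_linear_shap(X[:, active], w)
        active.pop(int(np.argmin(importance)))
        err = loo_rmse(X[:, active], y)
        if err <= best_err:  # ties favor the smaller feature set
            best_subset, best_err = list(active), err
    return sorted(best_subset), best_err

# Synthetic stand-in data: 64 samples, 8 candidate features,
# with only columns 1 and 4 actually driving the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + 0.1 * rng.normal(size=64)

subset, err = select_features(X, y)
print("selected feature indices:", subset)
print("leave-one-out RMSE:", round(err, 3))
```

Leave-one-out scoring is used here because it stays usable with only a few dozen samples. For nonlinear regressors, the importance step would need a general SHAP estimator in place of the closed-form linear expression.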

[1] S. Bauer, et al. Modelling Simul. Mater. Sci. Eng. 32, 063301 (2024)

[2] K. Aas, M. Jullum, and A. Løland. Artif. Intell. 298, 103502 (2021)

[3] T. A. R. Purcell, M. Scheffler, L. M. Ghiringhelli, and C. Carbogno. npj Comput. Mater. 9, 112 (2023)

*Funded by TEC1p (ERC Advanced Grant Nº 740233) and the University of Arizona College of Science.

Presenters

  • Thomas A R Purcell

    • University of Arizona

Authors

  • Thomas A R Purcell

    • University of Arizona
  • Yi Yao

    • The NOMAD Laboratory at the FHI of the MPS and MS1P e.V. Berlin
  • Raushan Anjum

    • University of Arizona
  • Matthias Scheffler

    • The NOMAD Laboratory at FHI, Max Planck Society