Improving Solvation Energy Predictions in VASPsol using Machine Learning

ORAL

Abstract

Density functional theory (DFT) can accurately predict material properties and reaction barriers. However, due to its high computational cost, DFT is often limited to small system sizes. This is especially true in liquid-phase calculations, where free-energy barriers can differ dramatically from those in the gas phase. To approximate the solvation effect, computational chemists use continuum models that stand in for the very large number of explicit solvent molecules in these systems. Continuum models attempt to capture the effect of the solvent on solute molecules and surfaces while dramatically reducing the computational cost. VASPsol implements a polarizable continuum model within VASP, a plane-wave DFT code. We trained machine learning models on VASPsol solvation energies and the accompanying prediction errors relative to experimental data for 450 molecules solvated in water, sourced from the Minnesota Solvation Database. Inputs for each molecule were constructed from the COSMO-SAC descriptors of the solute and the COSMO-SAC-predicted infinite dilution activity coefficient (IDAC). We carried out two separate tasks: (1) using neural networks to improve the solvation energy prediction of VASPsol, and (2) using the VASPsol solvation energy error as the target to examine which features contribute to decreased VASPsol accuracy. Through permutation feature importance analysis, we identified positively charged surface segments as key contributors to errors in VASPsol energy predictions. This research provides insights into the limitations of VASPsol and opens avenues for its improvement, promising better accuracy in DFT simulations in solvent environments.
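
As an illustration of the permutation feature importance analysis mentioned in the abstract, the following is a minimal sketch of the general technique, not the authors' implementation. It assumes a model is already available and measures how much the mean absolute error grows when one input column is shuffled. The data are synthetic, the descriptor names are placeholders, and `stand_in_model` is a hypothetical fixed function used in place of a trained neural network so that the example stays self-contained.

```python
import random


def mean_abs_error(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Return, per feature, the mean increase in MAE after shuffling that column."""
    rng = random.Random(seed)
    baseline = mean_abs_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j; all other columns keep their values.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mean_abs_error(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic inputs (assumption): three placeholder descriptors per "molecule".
feature_names = ["descriptor_a", "descriptor_b", "descriptor_c"]
data_rng = random.Random(42)
X = [[data_rng.uniform(0.0, 1.0) for _ in feature_names] for _ in range(200)]

# Synthetic target (assumption): depends strongly on descriptor_a,
# weakly on descriptor_b, and not at all on descriptor_c.
y = [2.0 * row[0] + 0.2 * row[1] for row in X]


def stand_in_model(row):
    """Hypothetical stand-in for a trained regressor."""
    return 2.0 * row[0] + 0.2 * row[1]


importances = permutation_importance(stand_in_model, X, y)
for name, value in zip(feature_names, importances):
    print(f"{name}: {value:.3f}")
```

A feature whose shuffling leaves the error unchanged receives an importance near zero, while features the model relies on receive larger values. In practice the importance is usually computed on held-out data so that it reflects what the model has generalized rather than what it has memorized.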

Presenters

  • Eric C Fonseca

    University of Florida

Authors

  • Eric C Fonseca

    University of Florida

  • Richard G Hennig

    University of Florida

  • Sean Florez

    University of Florida