Manifold Learning of Collective Variables for Enhanced Sampling Simulations

ORAL

Abstract

Among the main challenges in atomistic simulations of biomolecular systems is the so-called sampling problem, or rare event problem: high kinetic barriers hinder transitions between metastable states on typical simulation time scales and thereby impede proper sampling of the energy landscape. Numerous enhanced sampling methods have been developed to address this problem and sample rare event systems more efficiently. Many such methods work by identifying a few slow degrees of freedom, termed collective variables (CVs), and enhancing the sampling along these CVs. However, selecting CVs to analyze and drive the sampling is not trivial and often relies on chemical intuition. Machine learning (ML) methods, in particular dimensionality reduction or manifold learning methods, provide a possible solution to this issue.

Here, we present a manifold learning method called multiscale reweighted stochastic embedding (MRSE) for automatically constructing CVs to represent and drive the sampling of free energy landscapes in enhanced sampling simulations. The technique finds CVs by using a deep neural network to learn a mapping from the high-dimensional feature space to a low-dimensional latent space. Our work builds upon the popular t-distributed stochastic neighbor embedding (t-SNE) approach and addresses known limitations of t-SNE.
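As an illustration of the t-SNE objective that this kind of embedding builds upon, the following is a minimal NumPy sketch of the low-dimensional Student-t affinities and the Kullback-Leibler divergence that standard t-SNE minimizes. The function names are hypothetical, and the latent coordinates `Z` are passed in as a plain array; in a parametric setting such as the one described above, `Z` would instead be the output of a neural network applied to the features. This is not the MRSE implementation.

```python
import numpy as np

def student_t_affinities(Z):
    """Low-dimensional t-SNE affinities Q from latent coordinates Z (n x d)."""
    Z = np.asarray(Z, dtype=float)
    # Pairwise squared Euclidean distances in the latent space.
    d2 = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    # Student-t kernel with one degree of freedom (heavy tails).
    Q = 1.0 / (1.0 + d2)
    np.fill_diagonal(Q, 0.0)   # a point is not its own neighbor
    return Q / Q.sum()         # normalize over all pairs

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q) summed over pairs with P > 0; eps guards against log(0)."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / (Q[mask] + eps))))
```

Training an embedding then amounts to adjusting the latent coordinates (or the network parameters that produce them) to lower `kl_divergence(P, student_t_affinities(Z))`, where `P` holds the high-dimensional affinities.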

We introduce several new aspects to stochastic neighbor embedding algorithms that make MRSE suitable for enhanced sampling simulations, including: (1) a well-tempered landmark selection scheme; (2) a multiscale representation of the high-dimensional feature space; and (3) a reweighting scheme to account for biased training data. We demonstrate the performance of MRSE by applying it to several molecular systems. Furthermore, we present a theoretical motivation for our reweighting scheme and show how it can be generalized to other manifold learning techniques.
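To make the three ingredients listed above more concrete, here is a rough NumPy sketch under stated assumptions. It assumes that each training sample carries a bias potential value `bias[i]` in the same units as `kT`, that statistical weights take the form exp(V/kT), that landmarks are drawn with probability proportional to the weights raised to the power 1/alpha, and that pairwise affinities are an average of Gaussian kernels rescaled by the geometric mean of the two weights. All function names, parameters, and default values are hypothetical, and the published method may differ in detail; the cited papers give the actual formulation.

```python
import numpy as np

def statistical_weights(bias, kT=1.0):
    """Unnormalized weights w_i = exp(V_i / kT), shifted for numerical stability."""
    bias = np.asarray(bias, dtype=float)
    return np.exp((bias - bias.max()) / kT)

def well_tempered_landmarks(bias, n_landmarks, kT=1.0, alpha=1.0, rng=None):
    """Draw landmark indices with probability proportional to w_i ** (1 / alpha).

    alpha = 1 samples in proportion to the weights; large alpha approaches
    uniform selection. (Assumed form, for illustration only.)
    """
    rng = np.random.default_rng() if rng is None else rng
    p = statistical_weights(bias, kT) ** (1.0 / alpha)
    p /= p.sum()
    return rng.choice(len(p), size=n_landmarks, replace=False, p=p)

def reweighted_multiscale_affinities(X, bias, kT=1.0, scales=(0.5, 1.0, 2.0)):
    """Row-normalized affinities from a reweighted average of Gaussian kernels."""
    X = np.asarray(X, dtype=float)
    w = statistical_weights(bias, kT)
    # Pairwise squared Euclidean distances in feature space.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Multiscale kernel: average of Gaussians over several bandwidths.
    K = np.mean([np.exp(-d2 / (2.0 * s ** 2)) for s in scales], axis=0)
    # Pairwise reweighting factor: geometric mean of the two sample weights.
    R = np.sqrt(np.outer(w, w))
    P = R * K
    np.fill_diagonal(P, 0.0)                   # no self-affinity
    return P / P.sum(axis=1, keepdims=True)    # each row sums to one
```

A typical use of this sketch would first pick landmarks with `well_tempered_landmarks`, then build the affinity matrix for those landmarks only, for example `reweighted_multiscale_affinities(X[idx], bias[idx])`.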

Publications:
- J. Rydzewski and O. Valsson, "Multiscale Reweighted Stochastic Embedding: Deep Learning of Collective Variables for Enhanced Sampling", J. Phys. Chem. A 125, 6286 (2021) - DOI: 10.1021/acs.jpca.1c02869
- J. Rydzewski, M. Chen, T. K. Ghosh, and O. Valsson, "Reweighted Manifold Learning of Collective Variables from Enhanced Sampling Simulations", J. Chem. Theory Comput. 18, 7179-7192 (2022) - DOI: 10.1021/acs.jctc.2c00873
- J. Rydzewski, M. Chen, and O. Valsson, "Manifold Learning in Atomistic Simulations: A Conceptual Review", Mach. Learn.: Sci. Technol. 4, 031001 (2023) - DOI: 10.1088/2632-2153/ace81a

Presenters

  • Omar Valsson

    University of North Texas

Authors

  • Omar Valsson

    University of North Texas