Learning biophysical energy functions from protein structure data with physically-informed equivariant neural networks
ORAL
Abstract
Understanding protein structure and function is crucial both for our understanding of biology and for the development of a host of medical and non-medical technologies. Machine learning (ML) has been a driving force in this effort. However, data linking protein structure to function are relatively sparse, which makes training robust models difficult, and many current ML approaches can only work with a simplified representation of proteins. Here we show that a rotationally equivariant neural network trained on protein structure data, which has demonstrated the ability to learn an effective biophysical potential from data, can be used to modify the input coordinates of a protein structure to reflect new desired outputs. Specifically, when combined with our energy model, the input points can be optimized with respect to the network output, providing a means to relax atomic environments (e.g. to accommodate novel mutations in a protein). Using a gradient-based optimization scheme, we can reconstruct 3D coordinate sets of 30 to 100 atoms with an accuracy of 1-2 Å RMSD. Such a differentiable, invertible energy-based model would provide a strong foundation for learning a biophysical energy function for proteins, which could be used to study open problems in protein research such as predicting mutational effects and binding affinity.
* This work has been supported by the National Institutes of Health MIRA award (R35 GM142795) and the CAREER award from the National Science Foundation (grant No. 2045054).
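The idea of optimizing input coordinates against a differentiable energy, and then scoring the result by RMSD, can be illustrated with a minimal sketch. This is not the authors' network or training setup: the energy here is a hypothetical stand-in (a harmonic penalty on pairwise distances, which is invariant to rotation and translation), the gradient is written out analytically rather than obtained by backpropagation, and all names (`toy_energy_and_grad`, `kabsch_rmsd`, `step_size`, the 30-atom random reference) are assumptions made for illustration.

```python
import numpy as np


def pairwise_distances(coords):
    """Return the (N, N) distance matrix and the (N, N, 3) difference vectors."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1), diff


def toy_energy_and_grad(coords, target_dists):
    """Hypothetical stand-in for a learned energy: harmonic penalty on
    pairwise distances, E = 0.5 * sum_{i<j} (d_ij - t_ij)^2."""
    dists, diff = pairwise_distances(coords)
    np.fill_diagonal(dists, 1.0)      # avoid dividing by zero on the diagonal
    err = dists - target_dists
    np.fill_diagonal(err, 0.0)        # self-pairs contribute nothing
    energy = 0.25 * np.sum(err ** 2)  # 0.25 because each pair appears twice
    # dE/dx_i = sum_j (d_ij - t_ij) * (x_i - x_j) / d_ij
    grad = np.sum((err / dists)[:, :, None] * diff, axis=1)
    return energy, grad


def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal rigid alignment."""
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(P.T @ Q)
    d = np.sign(np.linalg.det(U @ Vt))        # guard against reflections
    R = U @ np.diag([1.0, 1.0, d]) @ Vt       # proper rotation, P @ R ~ Q
    return np.sqrt(np.mean(np.sum((P @ R - Q) ** 2, axis=1)))


rng = np.random.default_rng(0)
n_atoms = 30                                   # assumed size, low end of 30 to 100
reference = rng.normal(scale=5.0, size=(n_atoms, 3))
target_dists, _ = pairwise_distances(reference)

# Start from a perturbed copy of the reference and relax it by gradient descent.
coords = reference + rng.normal(scale=0.5, size=reference.shape)
initial_energy, _ = toy_energy_and_grad(coords, target_dists)
initial_rmsd = kabsch_rmsd(coords, reference)

step_size = 0.01                               # assumed, not tuned
for _ in range(500):
    _, grad = toy_energy_and_grad(coords, target_dists)
    coords = coords - step_size * grad         # update the inputs, not any weights

final_energy, _ = toy_energy_and_grad(coords, target_dists)
final_rmsd = kabsch_rmsd(coords, reference)
print(f"energy: {initial_energy:.3f} -> {final_energy:.3f}")
print(f"RMSD:   {initial_rmsd:.3f} -> {final_rmsd:.3f}")
```

In a setting like the one described in the abstract, the hand-written gradient would presumably be replaced by automatic differentiation through the trained network, with the network weights held fixed while the coordinates are updated.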
Presenters
- Kevin A Borisiak, University of Washington
Authors
- Kevin A Borisiak, University of Washington
- Armita Nourmohammad, University of Washington
- Michael N Pun, University of Washington
- Gian Marco Visani, University of Washington