Learning the determinants of interaction specificity in heterogeneous mixtures of disordered proteins
ORAL · Invited
Abstract
We present a machine-learning approach to predict the phase behavior of intrinsically disordered proteins (IDPs) in heterogeneous mixtures from their sequences. Our approach predicts partition coefficients and multicomponent phase diagrams with quantitative accuracy by identifying a low-dimensional latent space that governs the sequence- and context-dependent interactions among IDPs in solution. We apply our approach to state-of-the-art IDP force fields and demonstrate that our model accurately predicts the phase behavior observed in simulations of multicomponent mixtures. These results show that the highly simplified representation of IDP sequences learned by our model is sufficient to determine their phase behavior in mixtures with arbitrary numbers of IDP components and arbitrary compositions. Moreover, we show that Euclidean distances in this simplified representation are directly proportional to the differences in the partition coefficients of IDP sequences in arbitrary mixtures. This property of our model establishes a physically meaningful method to cluster IDP sequences, predict the effects of sequence mutations, and perform coevolutionary analyses. Our approach therefore provides a generalizable framework for describing IDP partitioning specificity that could be straightforwardly applied to high-throughput experiments.
*This work was supported by the National Institute of General Medical Sciences of the National Institutes of Health under award number R35GM155017.
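As a minimal illustration of the clustering idea described in the abstract, the sketch below groups sequences by Euclidean distance in a latent space. It is not the presenter's model or code: the sequence names, the two-dimensional coordinates, the cutoff value, and the helper `cluster_by_distance` are all hypothetical, chosen only to show how distances in a learned representation could be used to group sequences expected to partition similarly.

```python
import math
from itertools import combinations

# Hypothetical latent coordinates for four IDP sequences.
# These numbers are invented for illustration; they are not model output.
latent = {
    "seqA": (0.10, 0.20),
    "seqB": (0.15, 0.25),
    "seqC": (1.40, 1.10),
    "seqD": (1.35, 1.20),
}


def cluster_by_distance(points, cutoff):
    """Single-linkage grouping: any two sequences whose latent-space
    Euclidean distance is below `cutoff` end up in the same cluster."""
    parent = {name: name for name in points}

    def find(name):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(points, 2):
        if math.dist(points[a], points[b]) < cutoff:
            parent[find(a)] = find(b)

    clusters = {}
    for name in points:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# The cutoff is an arbitrary assumption for this toy data set.
clusters = cluster_by_distance(latent, cutoff=0.5)
print(clusters)
```

If latent-space distance tracks differences in partition coefficients, as the abstract reports, each group would collect sequences with similar partitioning; the cutoff then sets how large a difference is tolerated within a group.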
Presenters
- William M Jacobs, Princeton University