A Graph Attention Autoencoder for Predicting Protein Structure

ORAL

Abstract

Understanding how local interactions give rise to global protein structure remains a central challenge in molecular biophysics. Because of their computational cost, all-atom molecular dynamics simulations currently cannot fold many monomeric proteins and protein complexes. We therefore developed a Graph Attention Autoencoder (GATE) that learns compressed, physically meaningful representations of proteins directly from experimental protein structure data. Each amino acid is represented as a node, while edge features encode the shared surface area of the Voronoi polyhedra of neighboring amino acids, the difference in their relative solvent-accessible surface area (ΔrSASA), and the difference in their electrostatic charge. GATE was trained on >2,500 high-quality X-ray crystal structures and >10,000 synthetic “pseudo-proteins” generated from coarse-grained protein models. This framework produces low-dimensional representations of proteins that maintain essential physical features and can reconstruct full protein structures with high accuracy. The GATE model recovers much of the structural complexity of natural proteins, providing a foundation for future applications in predicting the structure of protein-protein complexes.
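To make the graph representation concrete, the following is a minimal sketch of how residue-level edge features and a single attention step might be assembled. It is not the GATE implementation. The input values, the names `rsasa`, `charge`, `contact_area`, `edge_features`, `build_edges`, and `attention_weights`, the sign convention for the differences, and the linear scoring function are all assumptions made for illustration.

```python
import math

# Hypothetical per-residue inputs (illustrative values, not data from this work).
rsasa = [0.10, 0.45, 0.80]    # relative solvent-accessible surface area per residue
charge = [0.0, 1.0, -1.0]     # electrostatic charge per residue
# Hypothetical shared Voronoi face areas for neighboring residue pairs (i < j).
contact_area = {(0, 1): 12.0, (1, 2): 7.5}


def edge_features(i, j):
    """Feature vector for the directed edge i -> j.

    Assumed layout: [shared Voronoi area, rSASA difference, charge difference],
    with differences taken as (target - source).
    """
    area = contact_area.get((min(i, j), max(i, j)), 0.0)
    return [area, rsasa[j] - rsasa[i], charge[j] - charge[i]]


def build_edges():
    """Create both directed edges for every pair of Voronoi neighbors."""
    edges = {}
    for (i, j) in contact_area:
        edges[(i, j)] = edge_features(i, j)
        edges[(j, i)] = edge_features(j, i)
    return edges


def attention_weights(edges, node, w):
    """Softmax over the incoming edges of `node`.

    The score is a plain dot product w . e, a simplification of the learned
    scoring functions used in graph attention layers.
    """
    scores = {
        src: sum(wk * ek for wk, ek in zip(w, feats))
        for (src, dst), feats in edges.items()
        if dst == node
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {src: math.exp(s - top) for src, s in scores.items()}
    total = sum(exps.values())
    return {src: e / total for src, e in exps.items()}


edges = build_edges()
weights = attention_weights(edges, node=1, w=[0.1, 1.0, 0.5])
```

Here `weights` holds one normalized coefficient per neighbor of residue 1. In a trained model the scoring weights would be learned, and the coefficients would weight the neighbor embeddings that the encoder aggregates.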

*Acknowledgement: NIH Training Grant T32GM145452.

Presenters

  • Jacob Sumner

    • Yale University

Authors

  • Jacob Sumner

    • Yale University
  • Naomi Brandt

    • Yale University
  • Corey S. O'Hern

    • Yale University