Predicting Protein Thermostability through Deep Learning Leveraging 3D Structural Information

In protein engineering, improving thermostability is essential for many industrial and pharmaceutical applications. However, identifying stabilizing mutations experimentally is time-consuming because the search space is enormous. With the increasing availability of protein structural and thermostability data, deep learning approaches for identifying thermostable candidates are gaining popularity. In this work, we present and benchmark a novel graph neural network, ProtGCN, that incorporates geometric and energetic details of proteins to predict changes in Gibbs free energy (ΔG), a key indicator of thermostability, upon single-point mutations. Unlike conventional methods that rely on sequence or structural features, our model uses protein graphs with rich node features, carefully preprocessed from a comprehensive dataset of approximately 4,149 mutated sequences across 117 protein families. In addition, ProtGCN is enhanced by incorporating embeddings from the Evolutionary Scale Modeling (ESM) protein language model into the protein graphs. This integration allows ProtGCN (ESM) to outperform comparison models: it achieves performance competitive with XGBoost and a protein language model-based multi-layer perceptron on all evaluation metrics and outperforms all models in further analyses. A particular strength of ProtGCN (ESM) is its ability to correctly identify and predict stabilizing and destabilizing mutations with extreme effects, which are typically underrepresented in thermostability datasets. These results suggest a promising direction for future computational protein engineering research.
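To make the idea of a protein graph with node features more concrete, the following is a minimal sketch of a single graph-convolution step on a residue contact graph. It is not the ProtGCN implementation. The 8 Å contact cutoff, the one-hot node features, the layer sizes, the random coordinates, and all function names are assumptions chosen for illustration, and the weights are untrained, so the output value carries no physical meaning.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def contact_adjacency(coords, cutoff=8.0):
    """Binary residue contact map from C-alpha coordinates.

    The 8 angstrom cutoff is an assumed, commonly used value,
    not one taken from the poster.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    adj = (dist < cutoff).astype(float)
    np.fill_diagonal(adj, 0.0)  # self-loops are added later, in gcn_layer
    return adj


def one_hot(sequence):
    """One-hot residue identity as a stand-in for richer node features."""
    feats = np.zeros((len(sequence), len(AMINO_ACIDS)))
    for i, aa in enumerate(sequence):
        feats[i, AMINO_ACIDS.index(aa)] = 1.0
    return feats


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ feats @ weight, 0.0)


def predict_scalar(adj, feats, w_hidden, w_out):
    """Mean-pool node embeddings and map them to one scalar output."""
    hidden = gcn_layer(adj, feats, w_hidden)
    return float(hidden.mean(axis=0) @ w_out)


# Hypothetical toy input: an 8-residue sequence with random-walk coordinates.
rng = np.random.default_rng(0)
sequence = "MKTAYIAK"
coords = np.cumsum(rng.normal(scale=2.5, size=(len(sequence), 3)), axis=0)

adj = contact_adjacency(coords)
feats = one_hot(sequence)
w_hidden = rng.normal(scale=0.1, size=(len(AMINO_ACIDS), 16))  # untrained
w_out = rng.normal(scale=0.1, size=16)                         # untrained
prediction = predict_scalar(adj, feats, w_hidden, w_out)
```

In a setup like the one the abstract describes, per-residue language-model embeddings could be concatenated to the columns of `feats` before the graph convolution. How ProtGCN actually combines them is not specified here.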

Publication type
Scientific poster
Title
Predicting Protein Thermostability through Deep Learning Leveraging 3D Structural Information
Media
Biological Materials Science - A workshop on biogenic, bioinspired, biomimetic and biohybrid materials for innovative optical, photonics and optoelectronics applications
Volume
2024
Publisher
TUM Campus Straubing for Biotechnology and Sustainability
Publication date
6 June 2024
Citation
Khanna, Ashima; Haselbeck, Florian; Grimm, Dominik (2024): Predicting Protein Thermostability through Deep Learning Leveraging 3D Structural Information. Biological Materials Science - A workshop on biogenic, bioinspired, biomimetic and biohybrid materials for innovative optical, photonics and optoelectronics applications 2024.