Bulletin of the American Physical Society
APS March Meeting 2023
Volume 68, Number 3
Las Vegas, Nevada (March 5-10)
Virtual (March 20-22); Time Zone: Pacific Time
Session T08: Physics of Proteins III: Evolution and Function of Molecular Interactions (Focus)
Sponsoring Units: DBIO; Chair: Xiaoqin Zou, University of Missouri; Room: 131 |
Thursday, March 9, 2023 11:30AM - 12:06PM |
T08.00001: Exploring protein biophysics with deep learning Invited Speaker: Claus Wilke Deep learning approaches are becoming increasingly useful for studying protein biophysics. For example, AlphaFold is famously one of the best currently available tools for predicting protein structure from sequence. Beyond structure prediction, deep learning approaches can help to predict mutational effects, protein function, or ligand binding. In all these applications, the primary challenge from a physics perspective is to understand the biophysical meaning of the predictions produced by machine learning, how the machine learning algorithms make their predictions, and how we can curate and select the right training data for obtaining good results. Here, I will discuss several projects in this field that we are currently pursuing in my lab. First, I will describe how machine learning methods can be used to identify sites that are primed for mutation. Second, I will discuss the differences between models trained purely on sequence data versus on structure data. Finally, I will demonstrate how protein embedding models can be used to search sequence databases for proteins with specific biophysical characteristics. |
Thursday, March 9, 2023 12:06PM - 12:18PM |
T08.00002: GENERALIST: Generative Probabilistic Non-Linear Tensor Factorization Model for Proteins Hoda Akl, Brooke Emison, Xiaochuan Zhao, Purushottam Dixit Exploring the space of functional protein sequences beyond the naturally occurring ones requires generative models that leverage known natural sequences to learn the correlations between amino acid positions. For large protein sequences with datasets of limited sample size, inference of the protein sequence space can be challenging or infeasible. To address this gap, we present GENERALIST: a generative probabilistic model for protein sequences based on tensor factorization. GENERALIST infers a lower-dimensional latent representation of the natural sequences, which can then be used to generate novel sequences. The generated ensemble conserves several higher-order statistics in the natural alignment. GENERALIST also reproduces the statistics of the sequence ensemble, including the distribution of nearest neighbor distances. Computational assessment of the sequence ensemble using AlphaFold2 suggests that the ensemble comprises structurally stable sequences. The model complexity in GENERALIST is tunable using the dimension of the latent space, which allows us to control the tradeoff between accuracy and generality. In this way, GENERALIST addresses the limitations of state-of-the-art generative models; the model accuracy is robust against the size of the natural protein sequence alignment and the length of the sequence. Notably, our framework is applicable to all types of categorical data including nucleotide sequences and binary data such as presence/absence of genes in genomes, neuronal spikes, etc. |
Thursday, March 9, 2023 12:18PM - 12:30PM |
T08.00003: Analyses of the cores of AlphaFold2 protein structure predictions Jillian Belluck, Alex T Grigas, Corey S O'Hern Developing computational methods to accurately predict the three-dimensional structure of a protein from its primary sequence of amino acids is an important and unsolved problem. AlphaFold2, a deep learning methodology developed by DeepMind to generate computational models of proteins, has been successful in recent Critical Assessment of protein Structure Prediction competitions. In the present work, we assess AlphaFold2 computational models using the number of residues in the core, a feature that is strongly correlated with protein stability. We find that while AlphaFold2's predictions for the E. coli proteome resemble X-ray crystal structures, the eukaryotic protein predictions contain too few core residues. Our analysis considers the influence of intrinsically disordered sequences on the fraction of core residues, using both AlphaFold2's per-residue confidence levels and the average charge and hydrophobicity of each protein. The variability in the core size of AlphaFold2's predictions across organisms demonstrates that while machine learning methods have increased the accuracy of computational models for protein structure, significant improvements must be made to achieve results comparable to those in experiments. |
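Illustrative sketch for T08.00003 (not taken from the abstract): one simple way to count core residues is to threshold the relative solvent-accessible surface area (rSASA) of each residue, optionally restricted to residues that the predictor reports with high confidence. The function name, the rSASA cutoff of 1e-3, the pLDDT cutoff of 70, and all numbers below are assumptions made for illustration; the abstract does not state the authors' definition or thresholds.

```python
def core_fraction(rsasa, plddt=None, min_plddt=0.0, threshold=1e-3):
    """Fraction of residues counted as core (rSASA <= threshold).

    rsasa: per-residue relative solvent-accessible surface area.
    plddt: optional per-residue confidence; residues below min_plddt
           are left out before the fraction is computed.
    The default threshold is an assumed value, not one from the abstract.
    """
    if plddt is not None and len(plddt) != len(rsasa):
        raise ValueError("rsasa and plddt must have the same length")
    kept = [
        value
        for index, value in enumerate(rsasa)
        if plddt is None or plddt[index] >= min_plddt
    ]
    if not kept:
        raise ValueError("no residues left after confidence filtering")
    n_core = sum(1 for value in kept if value <= threshold)
    return n_core / len(kept)


# Hypothetical values for an 8-residue toy chain (not real data).
example_rsasa = [0.0, 0.0005, 0.02, 0.35, 0.0, 0.61, 0.001, 0.12]
example_plddt = [95, 92, 88, 40, 97, 35, 90, 60]

all_residues = core_fraction(example_rsasa)
confident_only = core_fraction(example_rsasa, example_plddt, min_plddt=70)
```

In this toy input, dropping the three low-confidence residues raises the core fraction, which is the kind of effect a disorder-aware comparison has to separate from a genuine difference in packing.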
Thursday, March 9, 2023 12:30PM - 12:42PM |
T08.00004: Evaluating Machine Learning Techniques for Decoy Detection of Protein-Protein Interactions Naomi Brandt, Alex T Grigas, Lynne Regan, Corey S O'Hern Generating accurate computational models for protein-protein interfaces (PPIs) and determining the quality of these models remains a significant challenge. Over the past two decades, several methods have been developed to generate and score PPIs. There are two main approaches for PPI scoring: physics-based forcefields that include protein stereochemistry and van der Waals and electrostatic interactions, and knowledge-based scoring functions that are based on experimentally determined PPI structures from the Protein Data Bank. With advances in machine learning, neural networks can also be used for PPI model generation and scoring. |
Thursday, March 9, 2023 12:42PM - 12:54PM |
T08.00005: Local deformations of proteins through molecular simulations reveal allosteric couplings with implications for drug design Fabian Byléhn, Juan J De Pablo, Gustavo R Perez Lemus, Cintia A Menendez, Walter Alvarado Allosteric regulation is an important property of proteins with many applications in drug design, yet it is notoriously difficult to characterize for any general protein. Many proteins show very subtle conformational changes upon allosteric perturbations, with local changes that are not captured by global metrics such as Root Mean Squared Deviation. This poses a challenge for drug design, where subtle allosteric changes induced by drug binding are missed and the computed efficacy of drugs is mischaracterized. We show that the more natural language to describe conformational changes is a local metric based on an elastic strain formalism, which can capture local deformations induced by allosteric perturbations such as drug/peptide binding in Molecular Dynamics simulations. The shear strain tensor is calculated upon binding and reveals previously unknown allosteric sites and allosteric mechanisms. In particular, we find that this formalism explains the mechanisms of repurposed drugs against key proteins of the SARS-CoV-2 proteome and uncovers previously unknown binding sites that can be exploited in drug design. This methodology paves the way for the design of new allosteric drugs to tackle diseases that are hard to target through drugs that act at the functional site. |
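Illustrative sketch for T08.00005 (an assumed construction, not the authors' implementation): a common way to obtain a local strain from two conformations is to fit, for each atom, the deformation gradient F that best maps its neighbor vectors in the reference structure onto the same vectors in the deformed structure, then form the Green-Lagrange strain E = (FᵀF − I)/2 and take the norm of its traceless part as a shear measure. The function names, the least-squares fit, and the cutoff value are hypothetical choices; the abstract says only that an elastic strain formalism is used.

```python
import numpy as np


def local_strain(ref, cur, cutoff=8.0):
    """Per-atom Green-Lagrange strain tensors between two conformations.

    ref, cur: (N, 3) coordinates with identical atom ordering.
    cutoff: neighbor radius in the reference structure (assumed default).
    """
    ref = np.asarray(ref, dtype=float)
    cur = np.asarray(cur, dtype=float)
    if ref.shape != cur.shape or ref.ndim != 2 or ref.shape[1] != 3:
        raise ValueError("ref and cur must both have shape (N, 3)")
    strains = np.zeros((len(ref), 3, 3))
    for i in range(len(ref)):
        d_ref = ref - ref[i]
        near = np.linalg.norm(d_ref, axis=1) < cutoff
        near[i] = False  # exclude the atom itself
        x = d_ref[near]            # neighbor vectors before deformation
        y = (cur - cur[i])[near]   # the same vectors after deformation
        if x.shape[0] < 3:
            raise ValueError(f"atom {i} has fewer than 3 neighbors")
        # Least-squares fit of y_j ~ F x_j, written row-wise as x @ F.T = y.
        f_t, _, rank, _ = np.linalg.lstsq(x, y, rcond=None)
        if rank < 3:
            raise ValueError(f"atom {i} needs non-coplanar neighbors")
        f = f_t.T
        strains[i] = 0.5 * (f.T @ f - np.eye(3))
    return strains


def shear_magnitude(strains):
    """Frobenius norm of the traceless (deviatoric) part of each tensor."""
    strains = np.asarray(strains, dtype=float)
    trace = np.trace(strains, axis1=-2, axis2=-1)
    deviatoric = strains - trace[..., None, None] * np.eye(3) / 3.0
    return np.sqrt((deviatoric ** 2).sum(axis=(-2, -1)))
```

Because the strain is built from differences between neighbor vectors, a rigid rotation or translation of the whole structure gives zero strain without any prior alignment, whereas a global metric such as RMSD requires superposition first and then averages local changes over all atoms.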
Thursday, March 9, 2023 12:54PM - 1:06PM |
T08.00006: Characterising the intrinsically disordered region of ORF6 from SARS-CoV-2 Alice J Pettitt, Lydia Newton, Stephen McCarthy, Alethea B Tabor, Gabriella T Heller, Christian D Lorenz, D. Flemming Hansen Many viral proteins have flexible and disordered regions that lack a well-defined tertiary structure. These disordered regions are often functionally important in immune evasion and for rapid replication. One such viral protein is the 61-residue protein ORF6, from SARS-CoV-2. ORF6 is a potent interferon antagonist that has been shown to bind to the ribonucleic acid export 1 and GLEBS motif of nucleoporin 98 (Rae1-Nup98) heterodimer via its C-terminal region. The binding of ORF6 to the Rae1-Nup98 heterodimer prevents nuclear export of cellular mRNAs, which suppresses the antiviral immune response. |
Thursday, March 9, 2023 1:06PM - 1:42PM |
T08.00007: Evolution of the Structure and Function of the Cyanobacterial Orange Carotenoid Protein and its Quenching of the Cyanobacterial Light Harvesting Antenna Invited Speaker: Cheryl Kerfeld In contrast to those of plants, the photoprotective mechanisms of cyanobacteria have only recently begun to be characterized. One of the most prevalent, involving the Orange Carotenoid Protein (OCP), a photoreceptor, dissipates excess energy captured by the light harvesting antenna (phycobilisome or PBS). The OCP is a soluble, 34 kDa protein that binds a single carotenoid molecule. It is the only known photoactive protein that uses a carotenoid as its sole chromophore. The crystal structure of the OCPᴼ shows that the protein is composed of two structural domains: a carotenoid-binding N-terminal domain (NTD), unique to cyanobacteria, and a C-terminal domain (CTD) with superficial structural similarity to BLUF and LOV domains. The carotenoid spans the two domains. The absorption of blue-green light causes the OCP to convert from a dark stable orange form, OCPᴼ, to a light-activated red form, OCPᴿ. Structurally, the photoactivation is characterized by a 12 Å shift in the position of the carotenoid and, as recently revealed by our Cryo-EM structure of the quenching complex between the OCP and the PBS, a 60 Å/220 degree rotation of the CTD. The structure of the OCPᴿ-PBS complex also provides a high-resolution structural description showing how four 34 kDa OCPs, each with a single carotenoid, are able to quench the 6.3 MDa PBS with its 396 bilin pigments. In conjunction with analysis of genomic sequence data from ecophysiologically diverse cyanobacteria, we find a variety of carotenoproteins that are single-domain homologs of the OCP. Collectively, our observations suggest a model for the evolution of OCP-mediated photoprotection and provide a framework for co-opting elements of the OCP structurally and functionally for the development of optogenetic and artificial photosynthesis systems. |
Thursday, March 9, 2023 1:42PM - 1:54PM |
T08.00008: The Molecular Origin of Various DNA-repair Quantum Yields in Photolyases Chao Yang Photolyase (PL) is a blue-light-activated flavoenzyme that uses FADH⁻ as the catalytic cofactor to repair UV-induced DNA lesions including cyclobutane pyrimidine dimers (CPDs) and pyrimidine-pyrimidone (6-4) photoproducts (6-4 PP). Different classes of CPD photolyases show diverse genetic sequences but have similar folded structures. Class I CPD photolyases from bacteria have much higher repair quantum yields than class II CPD photolyases from plants. The difference mainly comes from a bifurcation in the initial electron transfer: class I CPD photolyases mainly use a tunnelling pathway (the electron tunnels directly from the isoalloxazine ring to the CPD substrate), while class II CPD photolyases mainly use a two-step hopping pathway (the electron first jumps to the adenine and then jumps to the CPD substrate). In this study, we switched two key residues in the active sites of class I (N341, R342 in EcPL from Escherichia coli) and class II photolyases (G381, F382 in AtPL from Arabidopsis thaliana). Steady-state repair quantum yield measurements show dramatically lower repair quantum yields in EcPL mutants compared to EcPL, while a different trend is observed in AtPL and its mutants. To reveal how these residues affect the repair reaction, ultrafast laser spectroscopy was used to determine the reaction rates of seven electron-transfer reactions in 10 elementary steps. We found that the repair quantum yields can be tuned by favoring either the electron tunnelling or the hopping channel, thus adjusting the quantum yield of electron injection to the CPD substrate. Photolyases evolved to have high affinity and specificity toward DNA lesions with a reasonable repair quantum yield. Solely increasing the repair quantum yield by a single mutation may disrupt the delicate balance between substrate binding and CPD repair. |
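Illustrative sketch for T08.00008 (a generic first-order branching model, not the kinetic scheme measured in the abstract): when tunnelling, hopping to the adenine, and excited-state decay compete from the same excited state, the quantum yield of electron injection is the sum of the branching ratios of the two productive routes, with the hopping route further weighted by the chance that the adenine intermediate passes the electron on rather than returning it. All rate constants and parameter names below are hypothetical and are given in arbitrary units.

```python
def injection_yield(k_tunnel, k_hop, k_decay, k_forward, k_back):
    """Quantum yield of electron injection into the substrate.

    k_tunnel:  direct tunnelling from the excited cofactor to the substrate
    k_hop:     first hop from the excited cofactor to the intermediate
    k_decay:   unproductive decay of the excited cofactor
    k_forward: second hop from the intermediate to the substrate
    k_back:    back transfer from the intermediate to the cofactor
    All rates are first-order and share the same (arbitrary) unit.
    """
    rates = (k_tunnel, k_hop, k_decay, k_forward, k_back)
    if any(rate < 0 for rate in rates):
        raise ValueError("rate constants must be non-negative")
    total = k_tunnel + k_hop + k_decay
    if total == 0:
        raise ValueError("at least one excited-state channel must be open")
    direct = k_tunnel / total
    if k_hop == 0:
        return direct
    if k_forward + k_back == 0:
        raise ValueError("the intermediate needs an exit channel")
    two_step = (k_hop / total) * (k_forward / (k_forward + k_back))
    return direct + two_step


# Hypothetical rate sets: one dominated by tunnelling, one by hopping.
tunnelling_dominated = injection_yield(8.0, 1.0, 1.0, 3.0, 1.0)
hopping_dominated = injection_yield(1.0, 8.0, 1.0, 1.0, 1.0)
```

With these made-up numbers the tunnelling-dominated case gives a higher injection yield because the hopping route loses part of its population to back transfer; whether a real enzyme behaves this way depends on the measured rates, which the abstract does not list.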
Thursday, March 9, 2023 1:54PM - 2:06PM |
T08.00009: Many-body van der Waals forces, polarization response and dynamical effects in poly-peptides Mario Galante, Alexandre Tkatchenko The modeling of conformations and dynamics of supramolecular systems is of primary importance for understanding physicochemical properties of soft matter. Although short-range interactions such as covalent and hydrogen bonding control the local molecular arrangements, non-covalent interactions play a dominant role in determining the global character of the conformations. The many-body dispersion (MBD) approach enables the inclusion of non-pairwise contributions that consistently yield more accurate energies and longer ranged forces than standard Lennard-Jones-like potentials. Here we focus on the signatures of such many-body forces on the dynamical properties of small peptides, both in terms of simplified backbone models and for a 15-residue polyalanine within semiempirical quantum mechanics. We show that beyond-pairwise terms consistently yield a decreased roughness of the energy landscape and more compact, globally optimized conformations [arXiv:2110.06646]. This is intimately related to the delocalization of the force contributions that derives from the higher versatility of the polarization response tensor. We therefore focus our analysis on such response properties, discussing coarse-graining strategies towards the formulation of MBD polarizabilities in terms of fragments, rather than atoms. |
Thursday, March 9, 2023 2:06PM - 2:18PM |
T08.00010: Examining dynamic allostery and communication in proteins via machine learning and statistical analysis Freddie R Salsbury, Dizhou Wu We will present results from applying machine learning and statistical techniques to the analysis of molecular dynamics simulations, with a particular emphasis on understanding how ensembles change and how different regions of proteins and protein complexes move and potentially communicate. We focus on thrombin as a particularly interesting protein from biomedical and physical viewpoints. |
Thursday, March 9, 2023 2:18PM - 2:30PM |
T08.00011: Investigating decoy detection for protein-protein interaction models using state-of-the-art scoring methods and a novel graph neural network Jacob Sumner, Grace Meng, Alex T Grigas, Corey S O'Hern Computational prediction and design of proteins is a difficult task that results in models with a wide variation in quality. Decoy detection algorithms seek to classify computational models as high-quality or low-quality without knowledge of the experimental structures. Recently, dramatic improvements have been made in decoy detection of models for single proteins, but decoy detection of models of protein-protein interfaces (PPIs) remains challenging. To assess the current state of the art for PPI decoy detection, we scored computational models generated from RosettaDock, ZDOCK, and HDOCK from a dataset of 32 heterodimeric proteins (with high-resolution x-ray crystal structures) against a standard measure of similarity to the x-ray crystal structure. We found that for some targets, the decoy scores were strongly correlated with the structural similarity scores. However, for other targets nearly all decoy scores were uncorrelated with the structural similarity scores, which indicates the importance of improving PPI scoring functions. To improve PPI decoy detection, we developed a graph attention neural network model. The model creates a graph with the amino acids as nodes and with node features determined by natural language processing of the amino acid sequence. We show results for PPI decoy detection after training the model on the Dockground 1.0 and ZDOCK decoy datasets, totaling over 170 unique heterodimers. |
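Illustrative sketch for T08.00011 (an assumed analysis, not the authors' code): one way to ask whether a scoring function tracks model quality for a given target is to compute the Spearman rank correlation between the decoy scores and a structural similarity measure. The abstract does not name the similarity measure or the correlation statistic, so both are assumptions here, and the numbers below are made up.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical decoys for two targets. Scores are energy-like (lower is
# better); similarity is a 0-1 quality measure (higher is better).
scores = [-310.0, -295.5, -280.2, -260.0, -255.1]
similarity_target_a = [0.82, 0.74, 0.55, 0.31, 0.12]
similarity_target_b = [0.31, 0.82, 0.12, 0.74, 0.55]

rho_a = spearman(scores, similarity_target_a)
rho_b = spearman(scores, similarity_target_b)
```

For an energy-like score, a correlation near -1 (target A in this toy data) means the score ranks decoys in the same order as the similarity measure, while a value near zero (target B) corresponds to the uncorrelated case described in the abstract.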