bims-lances Biomed News
on Landscapes from Cryo-EM and Simulations
Issue of 2024-08-11
four papers selected by
James M. Krieger, National Centre for Biotechnology



  1. Sci Rep. 2024 Aug 05. 14(1): 18149
      Cryogenic electron microscopy (cryo-EM) has emerged as a powerful method for the determination of structures of complex biological molecules. The accurate characterisation of the dynamics of such systems, however, remains a challenge. To address this problem, we introduce cryoENsemble, a method that applies Bayesian reweighting to conformational ensembles derived from molecular dynamics simulations to improve their agreement with cryo-EM data, thus enabling the extraction of dynamics information. We illustrate the use of cryoENsemble to determine the dynamics of the ribosome-bound state of the co-translational chaperone trigger factor (TF). We also show that cryoENsemble can assist with the interpretation of low-resolution, noisy or unaccounted regions of cryo-EM maps. Notably, we are able to link an unaccounted part of the cryo-EM map to the presence of another protein (methionine aminopeptidase, or MetAP), rather than to the dynamics of TF, and model its TF-bound state. Based on these results, we anticipate that cryoENsemble will find use for challenging heterogeneous cryo-EM maps for biomolecular systems encompassing dynamic components.
    Keywords:  Cryo-EM; Molecular dynamics simulations; Statistical inference; Structural biology; Trigger factor
    DOI:  https://doi.org/10.1038/s41598-024-68468-7
  2. Proc Natl Acad Sci U S A. 2024 Aug 13. 121(33): e2318951121
      An increasingly common viewpoint is that protein dynamics datasets reside in a nonlinear subspace of low conformational energy. Ideal data analysis tools should therefore account for such nonlinear geometry. The Riemannian geometry setting can be suitable for a variety of reasons. First, it comes with a rich mathematical structure to account for a wide range of geometries that can be modeled after an energy landscape. Second, many standard data analysis tools developed for data in Euclidean space can be generalized to Riemannian manifolds. In the context of protein dynamics, a conceptual challenge comes from the lack of guidelines for constructing a smooth Riemannian structure based on an energy landscape. In addition, computational feasibility in computing geodesics and related mappings poses a major challenge. This work considers these challenges. The first part of the paper develops a local approximation technique for computing geodesics and related mappings on Riemannian manifolds in a computationally feasible manner. The second part constructs a smooth manifold and a Riemannian structure that is based on an energy landscape for protein conformations. The resulting Riemannian geometry is tested on several data analysis tasks relevant for protein dynamics data. In particular, the geodesics with given start- and end-points approximately recover corresponding molecular dynamics trajectories for proteins that undergo relatively ordered transitions with medium-sized deformations. The Riemannian protein geometry also gives physically realistic summary statistics and retrieves the underlying dimension even for large-sized deformations within seconds on a laptop.
    Keywords:  Riemannian manifold; dimension reduction; interpolation; manifold-valued data; protein dynamics
    DOI:  https://doi.org/10.1073/pnas.2318951121
  3. Bioinform Adv. 2024; 4(1): vbae111
      Motivation: Volumetric 3D object analyses are being applied in research fields such as structural bioinformatics, biophysics, and structural biology, with potential integration of artificial intelligence/machine learning (AI/ML) techniques. One such method, 3D Zernike moments, has proven valuable in analyzing protein structures (e.g., protein fold classification, protein-protein interaction analysis, and molecular dynamics simulations). Their compactness and efficiency make them amenable to large-scale analyses. Established methods for deriving 3D Zernike moments, however, can be inefficient, particularly when higher order terms are required, hindering broader applications. As the volume of experimental and computationally predicted protein structure information continues to increase, structural biology has become a "big data" science requiring more efficient analysis tools.
      Results: This application note presents a Python-based software package, ZMPY3D, to accelerate computation of 3D Zernike moments by vectorizing the mathematical formulae and using graphics processing units (GPUs). The package offers popular GPU-supported libraries such as CuPy and TensorFlow together with NumPy implementations, aiming to improve computational efficiency, adaptability, and flexibility in future algorithm development. The ZMPY3D package can be installed via PyPI, and the source code is available from GitHub. Functionality for volumetric protein 3D structural similarity scores and for superposition transform matrices has been implemented, creating a powerful computational tool that will allow the research community to combine 3D Zernike moments with existing AI/ML tools and to advance research and education in protein structure bioinformatics.
      Availability and implementation: ZMPY3D, implemented in Python, is available on GitHub (https://github.com/tawssie/ZMPY3D) and PyPI, released under the GPL License.
    DOI:  https://doi.org/10.1093/bioadv/vbae111
  4. Commun Biol. 2024 Aug 08. 7(1): 956
      Human RAD52 (RAD52) is a DNA-binding protein involved in many DNA repair mechanisms and genomic stability maintenance. In the last few years, this protein was discovered to be a promising novel pharmacological target for anticancer strategies. Although interest in RAD52 has grown exponentially over the past decade, much about its structure and mechanism remains to be elucidated. Here, we report the 2.2 Å resolution cryo-EM reconstruction of the full-length RAD52 (FL-RAD52) protein. This allows us to describe the hydration shell of the N-terminal region of FL-RAD52, which is structured in an undecamer ring. Water molecules coordinate with protein residues to promote stabilization inside and among the protomers and within the inner DNA binding cleft to drive protein-DNA recognition. Additionally, through a multidisciplinary approach involving SEC-SAXS and computational methods, we comprehensively describe the highly flexible and dynamic organization of the C-terminal portion of FL-RAD52. This work discloses unprecedented structural details of FL-RAD52, which will be critical for characterizing its mechanism of action and for inhibitor development, particularly in the context of novel approaches to synthetic lethality and anticancer drug discovery.
    DOI:  https://doi.org/10.1038/s42003-024-06644-1