bims-strubi Biomed News
on Advances in structural biology
Issue of 2021‒09‒05
twenty-nine papers selected by
Alessandro Grinzato
European Synchrotron Radiation Facility

  1. Front Mol Biosci. 2021; 8: 724947
      Protein-protein docking is a useful tool for modeling the structures of protein complexes that have yet to be experimentally determined. Understanding the structures of protein complexes is a key component for formulating hypotheses in biophysics regarding the functional mechanisms of complexes. Protein-protein docking is an established technique for cases where the structures of the subunits have been determined. While the number of known structures deposited in the Protein Data Bank is increasing, there are still many cases where the structures of individual proteins that users want to dock are not determined yet. Here, we have integrated the AttentiveDist method for protein structure prediction into our LZerD webserver for protein-protein docking, which enables users to simply submit protein sequences and obtain full-complex atomic models, without having to supply any structure themselves. We have further extended the LZerD docking interface with a symmetrical homodimer mode. The LZerD server is available at
    Keywords:  LZerD; protein bioinformatics; protein structure prediction; protein-protein docking; structure modeling; symmetrical docking; web server
  2. J Struct Biol. 2021 Aug 25. pii: S1047-8477(21)00088-5. [Epub ahead of print]213(4): 107783
      The air-water interface (AWI) tends to adsorb proteins and frequently causes preferred orientation problems in cryo-electron microscopy (cryo-EM). Here, we examined cryo-EM data from protein samples frozen with different detergents and found that both anionic and cationic detergents promoted binding of proteins to the AWI. By contrast, some of the nonionic and zwitterionic detergents tended to prevent proteins from attaching to the AWI. The protein orientation distributions with different anionic detergents were similar and resembled that obtained without detergent. By contrast, cationic detergents gave distinct orientation distributions. Our results indicate that proteins adsorb to charged interfaces and that the negative charge of the AWI plays an important role in adsorbing proteins in conventional cryo-EM sample preparation. Based on these findings, a new method was developed in which anionic detergent is added at a concentration between 0.002% and 0.005%. Using this method, the protein particles exhibited more evenly distributed orientations and still adsorbed to the AWI, allowing them to be embedded at high concentration in a thin layer of ice, which will benefit cryo-EM structure determination.
    Keywords:  Air-water interface; Cryo-EM sample preparation; Preferred orientation; Surface charge
  3. Chem Rev. 2021 Sep 01.
      Native mass spectrometry (MS) is aimed at preserving and determining the native structure, composition, and stoichiometry of biomolecules and their complexes from solution after they are transferred into the gas phase. Major improvements in native MS instrumentation and experimental methods over the past few decades have led to a concomitant increase in the complexity and heterogeneity of samples that can be analyzed, including protein-ligand complexes, protein complexes with multiple coexisting stoichiometries, and membrane protein-lipid assemblies. Heterogeneous features of these biomolecular samples can be important for understanding structure and function. However, sample heterogeneity can make assignment of ion mass, charge, composition, and structure very challenging due to the overlap of tens or even hundreds of peaks in the mass spectrum. In this review, we cover data analysis, experimental, and instrumental advances and strategies aimed at solving this problem, with an in-depth discussion of theoretical and practical aspects of the use of available deconvolution algorithms and tools. We also reflect upon current challenges and provide a view of the future of this exciting field.
  4. iScience. 2021 Sep 24. 24(9): 102959
      Cryo-electron tomography has stepped fully into the spotlight. Enthusiasm is high. Fortunately for us, this is an exciting time to be a cryotomographer, but there is still a way to go before declaring victory. Despite its potential, cryo-electron tomography possesses many inherent challenges. How do we image through thick cell samples, and possibly even tissue? How do we identify a protein of interest amidst the noisy, crowded environment of the cytoplasm? How do we target specific moments of a dynamic cellular process for tomographic imaging? In this review, we cover the history of cryo-electron tomography and how it came to be, roughly speaking, as well as the many approaches that have been developed to overcome its intrinsic limitations.
    Keywords:  Biochemistry; Cell biology; Structural biology
  5. Front Mol Biosci. 2021; 8: 676268
      Paramagnetic nuclear magnetic resonance (NMR) methods have emerged as powerful tools for structure determination of large, sparsely protonated proteins. However traditional applications face several challenges, including a need for large datasets to offset the sparsity of restraints, the difficulty in accounting for the conformational heterogeneity of the spin-label, and noisy experimental data. Here we propose an integrative approach to structure determination combining sparse paramagnetic NMR with physical modelling to infer approximate protein structural ensembles. We use calmodulin in complex with the smooth muscle myosin light chain kinase peptide as a model system. Despite acquiring data from samples labeled only at the backbone amide positions, we are able to produce an ensemble with an average RMSD of ∼2.8 Å from a reference X-ray crystal structure. Our approach requires only backbone chemical shifts and measurements of the paramagnetic relaxation enhancement and residual dipolar couplings that can be obtained from sparsely labeled samples.
    Keywords:  NMR; calmodulin; integrative structural biology; modeling; paramagnetic relaxation enhancement; protein structure
  6. Curr Opin Struct Biol. 2021 Aug 26. pii: S0959-440X(21)00109-3. [Epub ahead of print]71 232-238
      An estimated half of all proteins contain a metal, with these being essential for a tremendous variety of biological functions. X-ray crystallography is the major method for obtaining structures at high resolution of these metalloproteins, but there are considerable challenges in obtaining intact structures due to the effects of radiation damage. Serial crystallography offers the prospect of determining low-dose synchrotron or effectively damage-free XFEL structures at room temperature and enables time-resolved or dose-resolved approaches. Complementary spectroscopic data can validate redox and/or ligand states within metalloprotein crystals. In this opinion, we discuss developments in the application of serial crystallographic approaches to metalloproteins and comment on future directions.
  7. Acta Crystallogr A Found Adv. 2021 Sep 01. 77(Pt 5): 472-479
      The power spectrum of proteins at high frequencies is remarkably well described by the flat Wilson statistics. Wilson statistics therefore plays a significant role in X-ray crystallography and more recently in electron cryomicroscopy (cryo-EM). Specifically, modern computational methods for three-dimensional map sharpening and atomic modelling of macromolecules by single-particle cryo-EM are based on Wilson statistics. Here the first rigorous mathematical derivation of Wilson statistics is provided. The derivation pinpoints the regime of validity of Wilson statistics in terms of the size of the macromolecule. Moreover, the analysis naturally leads to generalizations of the statistics to covariance and higher-order spectra. These in turn provide a theoretical foundation for assumptions underlying the widespread Bayesian inference framework for three-dimensional refinement and for explaining the limitations of autocorrelation-based methods in cryo-EM.
    Keywords:  Fourier analysis; Guinier plot; Wilson statistics; cryo-EM; power spectrum
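      A minimal numerical sketch of the flat high-frequency power spectrum described in this abstract. It assumes an idealized model that is not taken from the paper: identical unit point atoms placed uniformly at random in a unit cell. Under that assumption the mean of |F(k)|^2 over high-frequency vectors k should be close to the number of atoms N. The function name and its parameters are hypothetical.

```python
import cmath
import math
import random


def mean_power(num_atoms, num_frequencies, min_index=20, max_index=40, seed=0):
    """Average |F(k)|^2 over random high-frequency k for unit point atoms."""
    rng = random.Random(seed)
    # Assumption: atom positions are uniform in a unit cube (fractional coordinates).
    atoms = [(rng.random(), rng.random(), rng.random()) for _ in range(num_atoms)]
    total = 0.0
    for _ in range(num_frequencies):
        # Integer frequency vector with every component well away from zero.
        k = tuple(
            rng.choice((-1, 1)) * rng.randint(min_index, max_index) for _ in range(3)
        )
        structure_factor = sum(
            cmath.exp(2j * math.pi * (k[0] * x + k[1] * y + k[2] * z))
            for x, y, z in atoms
        )
        total += abs(structure_factor) ** 2
    return total / num_frequencies


n = 200
power = mean_power(num_atoms=n, num_frequencies=400)
print(f"mean |F|^2 = {power:.1f} for N = {n} atoms")
```

      The average fluctuates from run to run with the seed, so it is only expected to agree with N to within a few percent for these sample sizes.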
  8. Acta Crystallogr D Struct Biol. 2021 Sep 01. 77(Pt 9): 1153-1167
      Serial data collection has emerged as a major tool for data collection at state-of-the-art light sources, such as microfocus beamlines at synchrotrons and X-ray free-electron lasers. Challenging targets, characterized by small crystal sizes, weak diffraction and stringent dose limits, benefit most from these methods. Here, the use of a thin support made of a polymer-based membrane for performing serial data collection or screening experiments is demonstrated. It is shown that these supports are suitable for a wide range of protein crystals suspended in liquids. The supports have also proved to be applicable to challenging cases such as membrane proteins growing in the sponge phase. The sample-deposition method is simple and robust, as well as flexible and adaptable to a variety of cases. It results in an optimally thin specimen providing low background while maintaining minute amounts of mother liquor around the crystals. The 2 × 2 mm area enables the deposition of up to several microlitres of liquid. Imaging and visualization of the crystals are straightforward on the highly transparent membrane. Thanks to their affordable fabrication, these supports have the potential to become an attractive option for serial experiments at synchrotrons and free-electron lasers.
    Keywords:  SFX; fixed target; sample supports; serial crystallography
  9. Proteins. 2021 Aug 29.
      The high accuracy of some CASP14 models at the domain level prompted a more detailed evaluation of structure predictions on whole targets. For the first time in CASP, we evaluated accuracy of difficult domain assembly in models submitted for multidomain targets where the community predicted individual evaluation units with greater accuracy than full-length targets. Ten proteins with domain interactions that did not show evidence of conformational change and were not involved in significant oligomeric contacts were chosen as targets for the domain interaction assessment. Groups were ranked using complementary interaction scores (F1, QS-score and Jaccard coefficient) and their predictions were evaluated for their ability to correctly model inter-domain interfaces and overall protein folds. Target performance was broadly grouped into two clusters. The first consisted primarily of targets containing two evaluation units (EU) wherein predictors more broadly predicted domain positioning and interfacial contacts correctly. The other consisted of complex two- and three-EU targets where few predictors performed well. The highest ranked predictor, AlphaFold2, produced high-accuracy models on eight out of ten targets. Their interdomain scores on three of these targets were significantly higher than all other groups and were responsible for their overall outperformance in the category. We further highlight the performance of AlphaFold2 and the next best group, BAKER-experimental on several interesting targets. This article is protected by copyright. All rights reserved.
    Keywords:  CASP14; classification; fold space; protein domains; protein structure; protein-protein interactions; sequence homologs multidomain proteins; structure prediction
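      Two of the interaction scores named in this abstract, F1 and the Jaccard coefficient, can be illustrated on sets of inter-domain residue contacts. This is a sketch under assumptions: the representation of a contact as a pair of residue numbers, the function name, and the toy data are hypothetical and are not the assessors' actual pipeline.

```python
def interface_scores(predicted, reference):
    """Return (F1, Jaccard) for predicted versus reference contact sets."""
    predicted, reference = set(predicted), set(reference)
    shared = len(predicted & reference)
    precision = shared / len(predicted) if predicted else 0.0
    recall = shared / len(reference) if reference else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if shared else 0.0
    union = len(predicted | reference)
    # Jaccard is the intersection over the union of the two sets.
    jaccard = shared / union if union else 0.0
    return f1, jaccard


# Hypothetical inter-domain contacts as (residue_i, residue_j) pairs.
reference_contacts = {(10, 85), (11, 85), (12, 90), (15, 92)}
predicted_contacts = {(10, 85), (11, 85), (12, 91)}

f1, jaccard = interface_scores(predicted_contacts, reference_contacts)
print(f"F1 = {f1:.3f}, Jaccard = {jaccard:.3f}")
```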
  10. Acta Crystallogr D Struct Biol. 2021 Sep 01. 77(Pt 9): 1142-1152
      The process of turning 2D micrographs into 3D atomic models of the imaged macromolecules has been under rapid development and scrutiny in the field of cryo-EM. Here, some important methods for validation at several stages in this process are described. Firstly, how Fourier shell correlation of two independent maps and phase randomization beyond a certain frequency address the assessment of map resolution is reviewed. Techniques for local resolution estimation and map sharpening are also touched upon. The topic of validating models which are either built de novo or based on a known atomic structure fitted into a cryo-EM map is then approached. Map-model comparison using Q-scores and Fourier shell correlation plots is used to assure the agreement of the model with the observed map density. The importance of annotating the model with B factors to account for the resolvability of individual atoms in the map is illustrated. Finally, the timely topic of detecting and validating water molecules and metal ions in maps that have surpassed ∼2 Å resolution is described.
    Keywords:  Fourier shell correlation; annotation; cryo-EM; modeling; validation
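      The Fourier shell correlation mentioned in this abstract compares the Fourier coefficients of two independent maps shell by shell. The sketch below uses the standard normalized cross-correlation formula and assumes the coefficients have already been grouped into resolution shells; the grouping step, the function names, and the toy coefficients are illustrative assumptions.

```python
import math


def shell_correlation(coeffs_a, coeffs_b):
    """Normalized correlation of two lists of complex Fourier coefficients."""
    cross = sum((a * b.conjugate()).real for a, b in zip(coeffs_a, coeffs_b))
    power_a = sum(abs(a) ** 2 for a in coeffs_a)
    power_b = sum(abs(b) ** 2 for b in coeffs_b)
    norm = math.sqrt(power_a * power_b)
    return cross / norm if norm else 0.0


def fsc_curve(shells_a, shells_b):
    """One correlation value per resolution shell, lowest frequency first."""
    return [shell_correlation(a, b) for a, b in zip(shells_a, shells_b)]


# Hypothetical coefficients: the low-frequency shell agrees exactly,
# the high-frequency shell has one coefficient with opposite sign.
half_map_a = [[1 + 1j, 2 - 1j], [0.5 + 0.2j, -0.3 + 0.1j]]
half_map_b = [[1 + 1j, 2 - 1j], [0.5 + 0.2j, 0.3 - 0.1j]]

curve = fsc_curve(half_map_a, half_map_b)
print([round(value, 3) for value in curve])
```

      In practice the curve is read from low to high frequency, and the resolution estimate is taken where it falls below a chosen threshold.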
  11. J Synchrotron Radiat. 2021 Sep 01. 28(Pt 5): 1343-1356
      Imaging of biomolecules by ionizing radiation, such as electrons, causes radiation damage which introduces structural and compositional changes of the specimen. The total number of high-energy electrons per surface area that can be used for imaging in cryogenic electron microscopy (cryo-EM) is severely restricted due to radiation damage, resulting in low signal-to-noise ratios (SNR). High resolution details are dampened by the transfer function of the microscope and detector, and are the first to be lost as radiation damage alters the individual molecules which are presumed to be identical during averaging. As a consequence, radiation damage puts a limit on the particle size and sample heterogeneity with which electron microscopy (EM) can deal. Since a transmission EM (TEM) image is formed from the scattering process of the electron by the specimen interaction potential, radiation damage is inevitable. However, we can aim to maximize the information transfer for a given dose and increase the SNR by finding alternatives to the conventional phase-contrast cryo-EM techniques. Here some alternative transmission electron microscopy techniques are reviewed, including phase plate, multi-pass transmission electron microscopy, off-axis holography, ptychography and a quantum sorter. Their prospects for providing more or complementary structural information within the limited lifetime of the sample are discussed.
    Keywords:  holography; multi-pass TEM; phase plate; ptychography; quantum sorter
  12. Chem Sci. 2021 Aug 18. 12(32): 10836-10847
      Electrophilic peptides that form an irreversible covalent bond with their target have great potential for binding targets that have been previously considered undruggable. However, the discovery of such peptides remains a challenge. Here, we present Rosetta CovPepDock, a computational pipeline for peptide docking that incorporates covalent binding between the peptide and a receptor cysteine. We applied CovPepDock retrospectively to a dataset of 115 disulfide-bound peptides and a dataset of 54 electrophilic peptides. It produced a top-five scoring, near-native model, in 89% and 100% of the cases when docking from the native conformation, and 20% and 90% when docking from an extended peptide conformation, respectively. In addition, we developed a protocol for designing electrophilic peptide binders based on known non-covalent binders or protein-protein interfaces. We identified 7154 peptide candidates in the PDB for application of this protocol. As a proof-of-concept we validated the protocol on the non-covalent complex of 14-3-3σ and YAP1 phosphopeptide. The protocol identified seven highly potent and selective irreversible peptide binders. The predicted binding mode of one of the peptides was validated using X-ray crystallography. This case-study demonstrates the utility and impact of CovPepDock. It suggests that many new electrophilic peptide binders can be rapidly discovered, with significant potential as therapeutic molecules and chemical probes.
  13. Acta Crystallogr D Struct Biol. 2021 Sep 01. 77(Pt 9): 1116-1126
      Biochemical and biophysical experiments are essential for uncovering the three-dimensional structure and biological role of a protein of interest. However, meaningful predictions can frequently also be made using bioinformatics resources that transfer knowledge from a well studied protein to an uncharacterized protein based on their evolutionary relatedness. These predictions are helpful in developing specific hypotheses to guide wet-laboratory experiments. Commonly used bioinformatics resources include methods to identify and predict conserved sequence motifs, protein domains, transmembrane segments, signal sequences, and secondary as well as tertiary structure. Here, several such methods available through the MPI Bioinformatics Toolkit are described and how their combined use can provide meaningful information on a protein of unknown function is demonstrated. In particular, the identification of homologs of known structure using HHpred, internal repeats using HHrepID, coiled coils using PCOILS and DeepCoil, and transmembrane segments using Quick2D are focused on.
    Keywords:  bioinformatics tools; coiled coils; deep homology searches; homology modeling; protein sequence annotation; sequence features
  14. J Struct Biol. 2021 Aug 29. pii: S1047-8477(21)00085-X. [Epub ahead of print]213(4): 107780
      Electron cryomicroscopy (cryo-EM) has emerged as a powerful structural biology instrument to solve near-atomic three-dimensional structures. Despite the fast growth in the number of density maps generated from cryo-EM data, comparison tools among these reconstructions are still lacking. Current proposals to compare cryo-EM data derived volumes perform map subtraction based on adjustment of each volume grey level to the same scale. We present here a more sophisticated way of adjusting the volumes before comparing, which involves adjusting the grey-level scale and spectral energy while keeping phases intact inside a mask and constraining the result to be strictly positive. The adjustment that we propose leaves the volumes in the same numeric frame, allowing operations among the adjusted volumes to be performed more reliably. This adjustment can be a preliminary step for several applications such as comparison through subtraction, map sharpening, or combination of volumes through a consensus that selects the best resolved parts of each input map. Our development might also be used as a sharpening method using an atomic model as a reference. We illustrate the applicability of this algorithm with reconstructions derived from several experimental examples. This algorithm is implemented in the Xmipp software package and its applications are accessible in a user-friendly way through the cryo-EM image processing framework Scipion.
    Keywords:  Cryo-EM; Map fusion; SPA; Sharpening; Subtomogram averaging; Subtraction
  15. Protein Sci. 2021 Aug 30.
      Detergent-soluble proteins (DSPs) are commonly dissolved in lipid buffers for NMR experiments, but the huge lipid proton signal prevents recording of high-quality spectra. The use of costly deuterated lipids is thus required to replace non-deuterated ones. With conventional methods, detergents like dodecylphosphocholine (DPC) cannot be fully exchanged due to their high binding affinity to hydrophobic proteins. We propose an original and simple protocol which combines the use of acetonitrile, dialysis and lyophilization to disrupt the binding of lipids to the protein and allow their indirect replacement by their deuterated equivalents, while maintaining the native structure of the protein. Moreover, by this protocol, the detergent-to-protein molar ratio can be controlled as it challenges the protein structure. This protocol was applied to solubilize the Vpx protein, which was monitored by 1H-15N SOFAST-HMQC spectra upon addition of DPC-d38, and the best detergent-to-DSPs molar ratio was obtained for structural studies.
    Keywords:  3D structure; DPC-to-protein molar ratio; NMR; acetonitrile; detergent-soluble proteins
  16. Acta Crystallogr D Struct Biol. 2021 Sep 01. 77(Pt 9): 1168-1182
      In recent years, crystallographic fragment screening has matured into an almost routine experiment at several modern synchrotron sites. The hits of the screening experiment, i.e. small molecules or fragments binding to the target protein, are revealed along with their 3D structural information. Therefore, they can serve as useful starting points for further structure-based hit-to-lead development. However, the progression of fragment hits to tool compounds or even leads is often hampered by a lack of chemical feasibility. As an attractive alternative, compound analogs that embed the fragment hit structurally may be obtained from commercial catalogs. Here, a workflow is reported based on filtering and assessing such potential follow-up compounds by template docking. This means that the crystallographic binding pose was integrated into the docking calculations as a central starting parameter. Subsequently, the candidates are scored on their interactions within the binding pocket. In an initial proof-of-concept study using five starting fragments known to bind to the aspartic protease endothiapepsin, 28 follow-up compounds were selected using the designed workflow and their binding was assessed by crystallography. Ten of these compounds bound to the active site and five of them showed significantly increased affinity in isothermal titration calorimetry of up to single-digit micromolar affinity. Taken together, this strategy is capable of efficiently evolving the initial fragment hits without major synthesis efforts and with full control by X-ray crystallography.
    Keywords:  crystallographic fragment screening; fragment-based lead discovery; pose validation; structure-based drug design; template docking
  17. J Synchrotron Radiat. 2021 Sep 01. 28(Pt 5): 1309-1320
      X-ray-based techniques are a powerful tool in structural biology but the radiation-induced chemistry that results can be detrimental and may mask an accurate structural understanding. In the crystallographic case, cryocooling has been employed as a successful mitigation strategy but also has its limitations including the trapping of non-biological structural states. Crystallographic and solution studies performed at physiological temperatures can reveal otherwise hidden but relevant conformations, but are limited by their increased susceptibility to radiation damage. In this case, chemical additives that scavenge the species generated by radiation can mitigate damage but are not always successful and the mechanisms are often unclear. Using a protein designed to undergo a large-scale structural change from breakage of a disulfide bond, radiation damage can be monitored with small-angle X-ray scattering. Using this, we have quantitatively evaluated how three scavengers commonly used in crystallographic experiments - sodium nitrate, cysteine, and ascorbic acid - perform in solution at 10°C. Sodium nitrate was the most effective scavenger and completely inhibited fragmentation of the disulfide bond at a lower concentration (500 µM) compared with cysteine (∼5 mM) while ascorbic acid performed best at 5 mM but could only reduce fragmentation by ∼75% after a total accumulated dose of 792 Gy. The relative effectiveness of each scavenger matches their reported affinities for solvated electrons. Saturating concentrations of each scavenger shifted fragmentation from first order to a zeroth-order process, perhaps indicating the direct contribution of photoabsorption. The SAXS-based method can detect damage at X-ray doses far lower than those accessible crystallographically, thereby providing a detailed picture of scavenger processes. The solution results are also in close agreement with what is known about scavenger performance and mechanism in a crystallographic setting and suggest that a link can be made between the damage phenomenon in the two scenarios. Therefore, our engineered approach might provide a platform for more systematic and comprehensive screening of radioprotectants that can directly inform mitigation strategies for both solution and crystallographic experiments, while also clarifying fundamental radiation damage mechanisms.
    Keywords:  SAXS; disulfide bond; protein structure; radiation damage; scavengers
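      The shift from first-order to zeroth-order fragmentation reported in this abstract can be pictured with the two textbook rate laws expressed in terms of accumulated dose. This is only a sketch: the rate constant below is a hypothetical placeholder rather than a value from the study, and the function names are illustrative.

```python
import math


def intact_first_order(dose_gy, rate_per_gy):
    """Fraction of intact disulfide when the loss rate scales with what remains."""
    return math.exp(-rate_per_gy * dose_gy)


def intact_zeroth_order(dose_gy, rate_per_gy):
    """Fraction of intact disulfide when a fixed amount is lost per unit dose."""
    return max(0.0, 1.0 - rate_per_gy * dose_gy)


# Hypothetical rate constant (per Gy), chosen only to make the curves visible.
HYPOTHETICAL_RATE = 1.0e-3

for dose in (0, 200, 400, 792):
    first = intact_first_order(dose, HYPOTHETICAL_RATE)
    zeroth = intact_zeroth_order(dose, HYPOTHETICAL_RATE)
    print(f"{dose:4d} Gy  first-order {first:.3f}  zeroth-order {zeroth:.3f}")
```

      A first-order curve decays exponentially with dose, whereas a zeroth-order curve is a straight line until the intact fraction is exhausted, which is how the two regimes can be told apart in a dose series.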
  18. J Synchrotron Radiat. 2021 Sep 01. 28(Pt 5): 1278-1283
      An understanding of radiation damage effects suffered by biological samples during structural analysis using both X-rays and electrons is pivotal to obtain reliable molecular models of imaged molecules. This special issue on radiation damage contains six papers reporting analyses of damage from a range of biophysical imaging techniques. For X-ray diffraction, an in-depth study of multi-crystal small-wedge data collection single-wavelength anomalous dispersion phasing protocols is presented, concluding that an absorbed dose of 5 MGy per crystal was optimal to allow reliable phasing. For small-angle X-ray scattering, experiments are reported that evaluate the efficacy of three radical scavengers using a protein designed to give a clear signature of damage in the form of a large conformational change upon the breakage of a disulfide bond. The use of X-rays to induce OH radicals from the radiolysis of water for X-ray footprinting are covered in two papers. In the first, new developments and the data collection pipeline at the NSLS-II high-throughput dedicated synchrotron beamline are described, and, in the second, the X-ray induced changes in three different proteins under aerobic and low-oxygen conditions are investigated and correlated with the absorbed dose. Studies in XFEL science are represented by a report on simulations of ultrafast dynamics in protic ionic liquids, and, lastly, a broad coverage of possible methods for dose efficiency improvement in modalities using electrons is presented. These papers, as well as a brief synopsis of some other relevant literature published since the last Journal of Synchrotron Radiation Special Issue on Radiation Damage in 2019, are summarized below.
    Keywords:  SAXS; X-ray footprinting; X-ray imaging; XFEL simulations; dose; electron microscopy; macromolecular crystallography; radiation damage; room-temperature crystallography; serial crystallography; single-wavelength anomalous dispersion
  19. Protein Sci. 2021 Sep 01.
      The prediction of the three-dimensional structure of proteins from the amino acid sequence made a stunning breakthrough reaching atomic accuracy. Using the neural network-based method AlphaFold2, three-dimensional structures of almost the entire human proteome have been predicted and made available. To gain insight into how well AlphaFold2 structures represent the conformation of proteins in solution, I here compare the AlphaFold2 structures of selected small proteins with their 3D structures that were determined by NMR spectroscopy. Proteins were selected for which the 3D solution structures were determined on the basis of a very large number of distance restraints and residual dipolar couplings and are thus some of the best-resolved solution structures of proteins to date. The quality of the backbone conformation of the AlphaFold2 structures is assessed by fitting a large set of experimental residual dipolar couplings (RDCs). The analysis shows that experimental RDCs fit extremely well to the AlphaFold2 structures predicted for GB3, DinI and ubiquitin. In the case of GB3, the accuracy of the AlphaFold2 structure even surpasses that of a 1.1 Å crystal structure. Fitting of experimental RDCs furthermore allows identification of AlphaFold2 structures that are best representative of the protein's conformation in solution as seen for the EF hands of the N-terminal domain of Ca2+-ligated calmodulin. Taken together the analysis shows that structures predicted by AlphaFold2 can be highly representative of the solution conformation of proteins. The combination of AlphaFold2 structures with RDCs promises to be a powerful approach to study structural changes in proteins.
    Keywords:  AlphaFold; NMR spectroscopy; conformational dynamics; dipolar coupling
  20. Proteins. 2021 Sep 02.
      In this paper, we report our tFold framework's performance on the inter-residue contact prediction task in the 14th Critical Assessment of protein Structure Prediction (CASP14). Our tFold framework seamlessly combines both homologous sequences and structural decoys under an ultra-deep network architecture. Squeeze-excitation and axial attention mechanisms are employed to effectively capture inter-residue interactions. In CASP14, our best predictor achieved 41.78% in the averaged top-L precision for long-range contacts for all the 22 free-modeling (FM) targets, and ranked 1st among all the 60 participating teams. The tFold web server is now freely available at:
    Keywords:  CASP14; contact prediction; deep convolutional residual neural network; protein folding
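      The top-L precision for long-range contacts quoted in this abstract can be sketched as follows. The sequence-separation cutoff of 24 residues for "long-range" follows the usual CASP convention and is stated here as an assumption, as are the function name, the dictionary layout, and the toy scores.

```python
def top_precision(scores, true_contacts, num_top, min_separation=24):
    """Precision of the num_top highest-scoring long-range residue pairs.

    scores maps (i, j) residue pairs with i < j to predicted contact
    probabilities. Set num_top to the sequence length L for top-L precision.
    """
    long_range = [
        (probability, pair)
        for pair, probability in scores.items()
        if abs(pair[0] - pair[1]) >= min_separation
    ]
    # Highest predicted probability first.
    long_range.sort(reverse=True)
    selected = long_range[:num_top]
    if not selected:
        return 0.0
    hits = sum(1 for _, pair in selected if pair in true_contacts)
    return hits / len(selected)


# Hypothetical predictions and native contacts.
predicted_scores = {
    (1, 30): 0.90, (2, 40): 0.80, (5, 10): 0.95,
    (3, 50): 0.70, (4, 60): 0.60, (6, 70): 0.50,
}
native_contacts = {(1, 30), (3, 50), (6, 70)}

precision = top_precision(predicted_scores, native_contacts, num_top=4)
print(f"precision of top 4 long-range pairs = {precision:.2f}")
```

      The pair (5, 10) has the highest score but is excluded because its sequence separation is below the cutoff.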
  21. RSC Chem Biol. 2021 Feb 01. 2(1): 259-265
      Biochemical signaling is mediated by complexes between macromolecular receptors and their ligands, with the duration of the signal being directly related to the lifetime of the ligand-receptor complex. In the field of drug design, the recognition that drug efficacy in vivo depends on the lifetime of the drug-protein complex has spawned the concept of designing drugs with particular binding kinetics. To advance this field it is critical to investigate how the molecular details of designed ligands might affect the binding kinetics, as well as the equilibrium binding constant. Here we use protein NMR relaxation dispersion to determine linear free energy relationships involving the on- and off-rates and the affinity for a series of congeneric ligands targeting the carbohydrate recognition domain of galectin-3. Using this approach we determine the energy landscape and the position of the transition state along the reaction coordinate of protein-ligand binding. The results show that ligands exhibiting reduced off-rates achieve this by primarily stabilizing the bound state, but do not affect the transition state to any greater extent. The transition state forms early, that is, it is located significantly closer to the free state than to the bound state, suggesting a critical role of desolvation. Furthermore, the data suggest that different subclasses of ligands show different behavior with respect to these characteristics.
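      A linear free energy relationship of the kind described in this abstract can be examined by regressing ln(k_on) on ln(K_a), with K_a = k_on / k_off, across a ligand series. A slope near zero means affinity changes are carried almost entirely by the off-rate, which is consistent with a transition state that resembles the free state. The rate constants below are hypothetical and are not data from the study.

```python
import math


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (x, y) points."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical on-rates (1/(M s)) and off-rates (1/s) for four ligands.
k_on = [1.0e7, 1.1e7, 0.9e7, 1.0e7]
k_off = [1000.0, 300.0, 100.0, 30.0]

ln_affinity = [math.log(on / off) for on, off in zip(k_on, k_off)]
ln_on_rate = [math.log(on) for on in k_on]

slope = least_squares_slope(ln_affinity, ln_on_rate)
print(f"slope of ln(k_on) versus ln(K_a) = {slope:.3f}")
```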
  22. J Chem Inf Model. 2021 Sep 01.
      Nucleic acid-ligand interactions play an important role in numerous cellular processes such as gene function expression and regulation. Therefore, nucleic acids such as RNAs have become more and more important drug targets, where the structural determination of nucleic acid-ligand complexes is pivotal for understanding their functions and thus developing therapeutic interventions. Molecular docking has been a useful computational tool in predicting the complex structure between molecules. However, although a number of docking algorithms have been developed for protein-ligand interactions, only a few docking programs were presented for nucleic acid-ligand interactions. Here, we have developed a fast nucleic acid-ligand docking algorithm, named NLDock, by implementing our intrinsic scoring function ITScoreNL for nucleic acid-ligand interactions into a modified version of the MDock program. NLDock was extensively evaluated on four test sets and compared with five other state-of-the-art docking algorithms including AutoDock, DOCK 6, rDock, GOLD, and Glide. It was shown that our NLDock algorithm obtained a significantly better performance than the other docking programs in binding mode predictions and achieved the success rates of 73%, 36%, and 32% on the largest test set of 77 complexes for local rigid-, local flexible-, and global flexible-ligand docking, respectively. In addition, our NLDock approach is also computationally efficient and consumed an average of as short as 0.97 and 2.08 min for a local flexible-ligand docking job and a global flexible-ligand docking job, respectively. These results suggest the good performance of our NLDock in both docking accuracy and computational efficiency.
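      Docking success rates such as those quoted in this abstract are commonly computed as the fraction of complexes whose top-ranked pose lies within an RMSD cutoff of the experimental ligand position. The sketch below assumes a 2.0 Å cutoff, which is a widespread convention but is not stated in the abstract; the function names and coordinates are hypothetical.

```python
import math


def pose_rmsd(pose, reference):
    """RMSD in angstroms between matched atoms, without re-superposition."""
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(pose, reference)
    )
    return math.sqrt(squared / len(reference))


def success_rate(rmsds, cutoff=2.0):
    """Fraction of complexes whose top-ranked pose is within the cutoff."""
    return sum(1 for value in rmsds if value <= cutoff) / len(rmsds)


# Hypothetical three-atom ligand: docked pose shifted 1 angstrom along x.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (2.5, 1.5, 0.0)]
shift_rmsd = pose_rmsd(docked, native)

# Hypothetical top-pose RMSDs for five complexes.
rate = success_rate([0.8, 1.5, 3.2, 1.9, 6.0])
print(f"RMSD = {shift_rmsd:.2f} A, success rate = {rate:.0%}")
```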
  23. J Chem Theory Comput. 2021 Aug 31.
      Molecular dynamics simulations are widely used to determine equilibrium and dynamic properties of proteins. Nearly all simulations, currently, are carried out at constant temperature, with a Langevin thermostat among the most widely used. Thermostats distort protein dynamics, but whether or how such distortions can be corrected has long been an open question. Here, we show that constant-temperature simulations with a Langevin thermostat dilate protein dynamics and present a correction scheme to remove the dynamic distortions. Specifically, ns-scale time constants for overall rotation are dilated significantly but sub-ns time constants for internal motions are dilated modestly, while all motional amplitudes are unaffected. The correction scheme involves contraction of the time constants, with the contraction factor a linear function of the time constant to be corrected. The corrected dynamics of eight proteins are validated by NMR data for rotational diffusion and for backbone amide and side-chain methyl relaxation. The present work demonstrates that even for complex systems like proteins with dynamics spanning multiple timescales, one can predict how thermostats distort protein dynamics and remove such distortions. The correction scheme will have wide applications, facilitating force-field parameterization and propelling simulations to be on par with NMR and other experimental techniques in determining dynamic properties of proteins.
  24. Bioinformatics. 2021 Sep 02. pii: btab632. [Epub ahead of print]
      MOTIVATION: Protein model quality assessment (QA) is an essential component in protein structure prediction, which aims to estimate the quality of a structure model and/or select the most accurate model from a pool of structure models, without knowing the native structure. QA remains a challenging task in protein structure prediction.
    RESULTS: Based on the inter-residue distance predicted by the recent deep learning-based structure prediction algorithm trRosetta, we developed QDistance, a new approach to the estimation of both global and local qualities. QDistance works for both single-model and multi-model inputs. We designed several distance-based features to assess the agreement between the predicted and model-derived inter-residue distances. Together with a few widely used features, they are fed into a simple yet powerful linear regression model to infer the global QA scores. The local QA scores for each structure model are predicted based on a comparative analysis with a set of selected reference models. For multi-model input, the reference models are selected from the input based on the predicted global QA scores. For single-model input, the reference models are predicted by trRosetta. With the informative distance-based features, QDistance can predict the global quality with satisfactory accuracy. Benchmark tests on the CASP13 and the CAMEO structure models suggested that QDistance was competitive with other methods. Blind tests in the CASP14 experiments showed that QDistance was robust and ranked among the top predictors. In particular, QDistance was a top-3 local QA method and made the most accurate local QA predictions for unreliable local regions. Analysis showed that this superior performance can be attributed to the inclusion of the predicted inter-residue distance.
    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
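      To make the idea of a distance-based agreement feature concrete, here is a minimal sketch that compares a predicted inter-residue distance matrix with distances measured on a structure model. The mean absolute difference, the function names, the `min_separation` parameter, and the toy coordinates are all assumptions chosen for illustration; the abstract does not specify which features QDistance actually uses.

```python
import math


def pairwise_distances(coords):
    """Full matrix of Euclidean distances between residue coordinates."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def distance_agreement(predicted, coords, min_separation=3):
    """Mean absolute difference between predicted and model-derived distances.

    Only residue pairs at least `min_separation` positions apart in sequence
    are compared. Lower values mean the model agrees better with the prediction.
    """
    model = pairwise_distances(coords)
    n = len(coords)
    diffs = [
        abs(predicted[i][j] - model[i][j])
        for i in range(n)
        for j in range(i + min_separation, n)
    ]
    if not diffs:
        raise ValueError("no residue pairs satisfy the sequence separation cutoff")
    return sum(diffs) / len(diffs)


# Toy example: six residues on a straight line, 3.8 units apart.
reference = [(3.8 * i, 0.0, 0.0) for i in range(6)]
predicted = pairwise_distances(reference)  # stands in for a predicted distance map

# A second toy model in which the last residue is displaced.
distorted = reference[:-1] + [(3.8 * 5, 4.0, 0.0)]

print(distance_agreement(predicted, reference))  # identical geometry
print(distance_agreement(predicted, distorted))  # displaced residue
```

      In a pipeline like the one described, a feature of this kind would be one input among several to a regression model that outputs a global quality score.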
  25. J Synchrotron Radiat. 2021 Sep 01. 28(Pt 5): 1321-1332
      Synchrotron X-ray footprinting (XF) is a growing structural biology technique that leverages radiation-induced chemical modifications via X-ray radiolysis of water to produce hydroxyl radicals that probe changes in macromolecular structure and dynamics in solution states of interest. The X-ray Footprinting of Biological Materials (XFP) beamline at the National Synchrotron Light Source II provides the structural biology community with access to instrumentation and expert support in the XF method, and is also a platform for development of new technological capabilities in this field. The design and implementation of a new high-throughput endstation device based around use of a 96-well PCR plate form factor and supporting diagnostic instrumentation for synchrotron XF is described. This development enables a pipeline for rapid comprehensive screening of the influence of sample chemistry on hydroxyl radical dose using a convenient fluorescent assay, illustrated here with a study of 26 organic compounds. The new high-throughput endstation device and sample evaluation pipeline now available at the XFP beamline provide the worldwide structural biology community with a robust resource for carrying out well optimized synchrotron XF studies of challenging biological systems with complex sample compositions.
    Keywords:  Bluesky; X-ray radiolysis; high throughput; hydroxyl radical footprinting; macromolecular dynamics; nucleic acids; protein structure
  26. J Chem Inf Model. 2021 Sep 01.
      Machine learning scoring functions for protein-ligand binding affinity have been found to consistently outperform classical scoring functions when trained and tested on crystal structures of bound protein-ligand complexes. However, it is less clear how these methods perform when applied to docked poses of complexes. We explore how the use of docked rather than crystallographic poses for both training and testing affects the performance of machine learning scoring functions. Using the PDBbind Core Sets as benchmarks, we show that the performance of a structure-based machine learning scoring function trained and tested on docked poses is lower than that of the same scoring function trained and tested on crystallographic poses. We construct a hybrid scoring function by combining both structure-based and ligand-based features, and show that its ability to predict binding affinity using docked poses is comparable to that of purely structure-based scoring functions trained and tested on crystal poses. We also present a new, freely available validation set, the Updated DUD-E Diverse Subset, for binding affinity prediction using data from DUD-E and ChEMBL. Despite strong performance on docked poses of the PDBbind Core Sets, we find that our hybrid scoring function sometimes generalizes poorly to a protein target not represented in the training set, demonstrating the need for improved scoring functions and additional validation benchmarks.
  27. Curr Opin Struct Biol. 2021 Aug 27. pii: S0959-440X(21)00119-6. [Epub ahead of print]72 46-54
      Physics and physical chemistry are an important thread in computational protein design, complementary to knowledge-based tools. They provide molecular mechanics scoring functions that need little or no ad hoc parameter readjustment, methods to thoroughly sample equilibrium ensembles, and different levels of approximation for conformational flexibility. They led recently to the successful redesign of a small protein using a physics-based folded state energy. Adaptive Monte Carlo or molecular dynamics schemes were discovered where protein variants are populated as per their ligand-binding free energy or catalytic efficiency. Molecular dynamics have been used for backbone flexibility. Implicit solvent models have been refined, polarizable force fields applied, and many physical insights obtained.
  28. Sci Rep. 2021 Sep 02. 11(1): 17619
      Understanding drug-drug interactions is an essential step to reduce the risk of adverse drug events before clinical drug co-prescription. Existing methods, commonly integrating heterogeneous data to increase model performance, often suffer from a high model complexity. As such, how to elucidate the molecular mechanisms underlying drug-drug interactions while preserving rational biological interpretability is a challenging task in computational modeling for drug discovery. In this study, we attempt to investigate drug-drug interactions via the associations between genes that two drugs target. For this purpose, we propose a simple drug target profile representation to depict drugs and drug pairs, from which an l2-regularized logistic regression model is built to predict drug-drug interactions. Furthermore, we define several statistical metrics in the context of human protein-protein interaction networks and signaling pathways to measure the interaction intensity, interaction efficacy and action range between two drugs. Large-scale empirical studies including both cross validation and independent tests show that the proposed drug target profiles-based machine learning framework outperforms existing data integration-based methods. The proposed statistical metrics show that two drugs easily interact when they target common genes, when their target genes connect via short paths in protein-protein interaction networks, or when their target genes are located in signaling pathways that have cross-talk. The unravelled mechanisms could provide biological insights into potential adverse drug reactions of co-prescribed drugs.
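      The two network-level observations above (shared target genes, and target genes connected by short paths) can be illustrated with a small sketch. It represents each drug as a set of target genes and uses a breadth-first search over a toy protein-protein interaction graph. The gene and drug names, the edge list, and both helper functions are hypothetical; the paper's own statistical metrics are not given in the abstract and may be defined differently.

```python
from collections import deque


def build_graph(edges):
    """Undirected adjacency map from a list of (gene, gene) interaction pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def min_target_distance(graph, targets_a, targets_b):
    """Fewest interaction edges between any target of A and any target of B.

    Multi-source breadth-first search starting from all targets of drug A.
    Returns 0 if the drugs share a target and None if no path exists.
    """
    targets_b = set(targets_b)
    visited = set(targets_a)
    queue = deque((gene, 0) for gene in targets_a)
    while queue:
        gene, dist = queue.popleft()
        if gene in targets_b:
            return dist
        for neighbor in graph.get(gene, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


# Toy interaction network and drug target profiles, invented for illustration.
ppi = build_graph([("G1", "G2"), ("G2", "G3"), ("G3", "G4")])
profiles = {
    "drug_X": {"G1", "G2"},
    "drug_Y": {"G2", "G5"},
    "drug_Z": {"G4"},
    "drug_W": {"G5"},
}

shared = profiles["drug_X"] & profiles["drug_Y"]
print("shared targets of X and Y:", shared)
print("X to Z distance:", min_target_distance(ppi, profiles["drug_X"], profiles["drug_Z"]))
print("W to Z distance:", min_target_distance(ppi, profiles["drug_W"], profiles["drug_Z"]))
```

      In this toy setting, a pair with shared targets has distance 0, while a pair whose targets lie in disconnected parts of the network has no path at all.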
  29. J Am Med Inform Assoc. 2021 Sep 02. pii: ocab162. [Epub ahead of print]
      OBJECTIVE: To develop an end-to-end deep learning framework based on a protein-protein interaction (PPI) network to make synergistic anticancer drug combination predictions.
    MATERIALS AND METHODS: We propose a deep learning framework named Graph Convolutional Network for Drug Synergy (GraphSynergy). GraphSynergy adapts a spatial-based Graph Convolutional Network component to encode the high-order topological relationships in the PPI network of protein modules targeted by a pair of drugs, as well as the protein modules associated with a specific cancer cell line. The pharmacological effects of drug combinations are explicitly evaluated by their therapy and toxicity scores. An attention component is also introduced in GraphSynergy, which aims to capture the pivotal proteins that play a part in both the PPI network and biomolecular interactions between drug combinations and cancer cell lines.
    RESULTS: GraphSynergy outperforms the classic and state-of-the-art models in predicting synergistic drug combinations on the 2 latest drug combination datasets. Specifically, GraphSynergy achieves accuracy values of 0.7553 (11.94% improvement compared to DeepSynergy, the latest published drug combination prediction algorithm) and 0.7557 (10.95% improvement compared to DeepSynergy) on the DrugCombDB and Oncology-Screen datasets, respectively. Furthermore, the proteins assigned high contribution weights during the training of GraphSynergy are shown to play a role in molecular functions and biological processes such as transcription and transcription regulation.
    CONCLUSION: The introduction of topological relations between drug combination and cell line within the PPI network can significantly improve the capability of synergistic drug combination identification.
    Keywords:  anticancer; deep learning; drug combination; graph convolutional network; network