bims-micpro Biomed News
on Discovery and characterization of microproteins
Issue of 2023–08–06
four papers selected by
Thomas Farid Martínez, University of California, Irvine



  1. Cardiovasc Res. 2023 Jul 30. pii: cvad112. [Epub ahead of print]
      Long non-coding RNAs (lncRNAs), which are RNA transcripts exceeding 200 nucleotides, were believed to lack any protein-coding capacity. However, advances in -omics technology have revealed that some lncRNAs have small open reading frames (sORFs) that can be translated by ribosomes to encode peptides. Some of these peptides serve important biological functions by interacting with their targets to modulate transcriptional or signaling axes, thereby enhancing or suppressing the occurrence and progression of cardiovascular disease (CVD). In this review, we summarize what is known about research strategies for lncRNA-encoded peptides, mainly the predictive websites/tools and experimental methods that have been widely used for prediction, identification, and validation. More importantly, we have compiled a list of lncRNA-encoded peptides, with a focus on those that play significant roles in cardiovascular physiology and pathology, including RNO-sORF6/RNO-sORF7/RNO-sORF8, DOWRF, and NLN. Additionally, we have outlined the functions and mechanisms of these peptides in cardiovascular physiology and pathology, such as cardiomyocyte hypertrophy, myocardial contraction, myocardial infarction, and vascular remodeling. Finally, we provide an overview of the existing challenges and potential future developments in the field of lncRNA-encoded peptides and consider prospective avenues for further research. Because many lncRNA-encoded peptides have not yet been functionally annotated, their application in CVD diagnosis and treatment still requires further research.
    Keywords:  Cardiovascular disease; Encoded peptide; Long noncoding RNA; Open reading frame; Physiology and pathology
    DOI:  https://doi.org/10.1093/cvr/cvad112
  2. Sci Rep. 2023 Aug 03. 13(1): 12591
      Moonlighting genes encode single polypeptide molecules that perform multiple and often unrelated functions. These genes occur across all domains of life. Their ubiquity and functional diversity raise many questions as to their origins, evolution, and role in the cell cycle. In this study, we present a simple bioinformatics probe that allows us to rank genes by antisense translation potential, and we show that this probe reliably enriches for moonlighting genes across a variety of organisms. We find that moonlighting genes harbor putative antisense open reading frames (ORFs) rich in codons for non-polar amino acids. We also find that moonlighting genes tend to co-locate with genes involved in cell wall, cell membrane, or cell envelope production. On the basis of this and other findings, we propose a model in which moonlighting gene products are likely to escape the cell through gaps in the cell wall and membrane, at wall/membrane construction sites. In this model, antisense ORFs produce "membrane-sticky" protein products that effectively bind moonlighting-gene DNA to the cell membrane in porous areas where intensive cell-wall/cell-membrane construction is underway. This leads to high potential for escape of moonlighting proteins to the cell surface. Evolutionary and other implications of these findings are discussed.
    DOI:  https://doi.org/10.1038/s41598-023-39869-x
  3. Methods Mol Biol. 2023; 2686: 509-536
      Understanding the global and dynamic nature of plant developmental processes requires not only the study of the transcriptome, but also of the proteome, including its largely uncharacterized peptidome fraction. Recent advances in proteomics and high-throughput analyses of translating RNAs (ribosome profiling) have begun to address this issue, evidencing the existence of novel, uncharacterized, and possibly functional peptides. To validate the accumulation in tissues of sORF-encoded polypeptides (SEPs), the basic setup of proteomic analyses (i.e., LC-MS/MS) can be followed. However, the detection of peptides that are small (up to ~100 aa, 6-7 kDa) and novel (i.e., not annotated in reference databases) presents specific challenges that need to be addressed both experimentally and with computational biology resources. Several methods have been developed in recent years to isolate and identify peptides from plant tissues. In this chapter, we outline two different peptide extraction protocols and the subsequent peptide identification by mass spectrometry using the database search or the de novo identification methods.
    Keywords:  Ammonium sulphate; Arabidopsis; C-18; Database; Mass spectrometry; Peptidome; Reverse-phase chromatography; Ultrafiltration
    DOI:  https://doi.org/10.1007/978-1-0716-3299-4_24
  4. Anal Chem. 2023 Aug 03.
      Small proteins of around 50 aa in length have been largely overlooked in genetic and biochemical assays due to the inherent challenges of detecting and characterizing them. Recent discoveries of their critical roles in many biological processes have led to an increased recognition of the importance of small proteins for basic research and as potential new drug targets. One example is CcoM, a 36 aa subunit of the cbb3-type oxidase that plays an essential role in adaptation to oxygen-limited conditions in Pseudomonas stutzeri (P. stutzeri), a model for the clinically relevant, opportunistic pathogen Pseudomonas aeruginosa. However, as no comprehensive data were available in P. stutzeri, we devised an integrated, generic approach to study small proteins more systematically. Using the first complete genome as a basis, we conducted bottom-up proteomics analyses and established a digest-free, direct-sequencing proteomics approach to study cells grown under aerobic and oxygen-limiting conditions. Finally, we also applied a proteogenomics pipeline to identify missed protein-coding genes. Overall, we identified 2921 known and 29 novel proteins, many of which were differentially regulated. Among 176 small proteins, 16 were novel. Direct sequencing, featuring a specialized precursor acquisition scheme, exhibited advantages in the detection of small proteins, with higher (up to 100%) sequence coverage and more spectral counts, including for sequences with high proline content. Three novel small proteins, uniquely identified by direct sequencing and not conserved beyond P. stutzeri, were predicted to form an operon with a conserved protein and may represent de novo genes. These data demonstrate the power of this combined approach to study small proteins in P. stutzeri and show its potential for other prokaryotes.
    DOI:  https://doi.org/10.1021/acs.analchem.3c00676