bims-micpro Biomed News
on Discovery and characterization of microproteins
Issue of 2020–09–13
three papers selected by
Thomas Farid Martínez, University of California, Irvine



  1. J Proteomics. 2020 Sep 03. pii: S1874-3919(20)30333-X. [Epub ahead of print] 103965
      Small proteins and short open reading frame-encoded peptides (SEPs) are of fundamental importance because of their essential roles in biological processes. However, annotating or identifying them is challenging, owing in part to the limitations of traditional genome annotation pipelines and to their inherently low abundance and low molecular weight. To discover and characterize SEPs in the Hep3B cell line, we developed an optimized peptidomic assay by combining different peptide extraction and separation methods. Organic solvent precipitation improved the enrichment of low-molecular-weight proteins and peptides, and the data clearly showed a beneficial effect from the reduction of sample complexity, resulting in high-quality MS/MS spectra. Furthermore, the different strategies were complementary, increasing both the total number of small proteins identified and their sequence coverage. In total, 1192 proteins of fewer than 100 amino acids were identified, including 271 newly discovered SEPs that have been annotated in the OpenProt database, 147 of them encoded by ncRNA or lincRNA. These results provide robust evidence that the human proteome is more complex than previously appreciated, which will benefit the discovery of proteins lacking functional annotation.
    SIGNIFICANCE: In this work, methods were optimized to identify SEPs in Hep3B. Organic solvent precipitation improved the enrichment of low-molecular-weight proteins and peptides, and the data clearly showed a beneficial effect from the reduction of sample complexity, resulting in high-quality MS/MS spectra. The different strategies were complementary, increasing both the total number of small proteins identified and their sequence coverage. In total, 1192 proteins of fewer than 100 amino acids were identified, including 271 newly discovered SEPs that have been annotated in the OpenProt database, 147 of them encoded by ncRNA or lincRNA. Furthermore, 22 SEPs generated from uORFs may have a potential effect on translational control, and 149 newly identified SEPs have known functional domains or cross-species conservation. These results present robust evidence for the coding potential of overlooked regions of the human genome and may provide additional insights into tumor biology.
    Keywords:  Acetonitrile precipitation; Hep3B cell line; Peptidomic; SEP enrichment; Short open reading frames; sORF-encoded peptides
    DOI:  https://doi.org/10.1016/j.jprot.2020.103965
  2. Nature. 2020 Sep 09.
      Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of the ongoing coronavirus disease 2019 (COVID-19) pandemic. To understand SARS-CoV-2 pathogenicity and antigenic potential, and to develop therapeutic tools, it is essential to portray the full repertoire of its expressed proteins. The SARS-CoV-2 coding capacity map is currently based on computational predictions and relies on homology to other coronaviruses. Since coronaviruses differ in their protein array, especially in the variety of accessory proteins, it is crucial to characterize the specific collection of SARS-CoV-2 proteins in an unbiased and open-ended manner. Using a suite of ribosome profiling techniques, we present a high-resolution map of the SARS-CoV-2 coding regions, allowing us to accurately quantify the expression of canonical viral open reading frames (ORFs) and to identify 23 unannotated viral ORFs. These ORFs include upstream ORFs (uORFs) that likely play a regulatory role, several in-frame internal ORFs lying within existing ORFs, resulting in N-terminally truncated products, as well as internal out-of-frame ORFs, which generate novel polypeptides. We further show that viral mRNAs are not translated more efficiently than host mRNAs; rather, virus translation dominates host translation due to high levels of viral transcripts. Our work provides a rich resource, which will form the basis of future functional studies.
    DOI:  https://doi.org/10.1038/s41586-020-2739-1
  3. Genome Biol. 2020 Sep 07. 21(1): 237
      BACKGROUND: Several long noncoding RNAs (lncRNAs) have been shown to function as components of molecular machines that play fundamental roles in biology. While the number of annotated lncRNAs in mammalian genomes has greatly expanded, studying lncRNA function has been a challenge due to their diverse biological roles and because lncRNA loci can contain multiple molecular modes that may exert function.
    RESULTS: We previously generated and characterized a cohort of 20 lncRNA loci knockout mice. Here, we extend this initial study and provide a more detailed analysis of the highly conserved lncRNA locus, taurine-upregulated gene 1 (Tug1). We report that Tug1-knockout male mice are sterile with underlying defects including a low number of sperm and abnormal sperm morphology. Because lncRNA loci can contain multiple modes of action, we wanted to determine which, if any, potential elements contained in the Tug1 genomic region have any activity. Using engineered mouse models and cell-based assays, we provide evidence that the Tug1 locus harbors two distinct noncoding regulatory activities, as a cis-DNA repressor that regulates neighboring genes and as a lncRNA that can regulate genes by a trans-based function. We also show that Tug1 contains an evolutionarily conserved open reading frame that, when overexpressed, produces a stable protein that impacts mitochondrial membrane potential, suggesting a potential third coding function.
    CONCLUSIONS: Our results reveal an essential role for the Tug1 locus in male fertility and uncover evidence for distinct molecular modes in the Tug1 locus, thus highlighting the complexity present at lncRNA loci.
    Keywords:  Allele-specific; Cis-regulatory elements; DNA repressor; Fertility; Genetics; Genomics; Mouse; RNA-seq; Tug1; lncRNA
    DOI:  https://doi.org/10.1186/s13059-020-02081-5