bims-micpro Biomed News
on Discovery and characterization of microproteins
Issue of 2023–03–12
six papers selected by
Thomas Farid Martínez, University of California, Irvine



  1. Nat Commun. 2023 Mar 10. 14(1): 1328
      The TINCR (Terminal differentiation-Induced Non-Coding RNA) gene is selectively expressed in epithelial tissues and is involved in the control of human epidermal differentiation and wound healing. Despite its initial report as a long non-coding RNA, the TINCR locus codes for a highly conserved ubiquitin-like microprotein associated with keratinocyte differentiation. Here we report the identification of TINCR as a tumor suppressor in squamous cell carcinoma (SCC). TINCR is upregulated by UV-induced DNA damage in a TP53-dependent manner in human keratinocytes. Decreased TINCR protein expression is prevalent in skin and head and neck squamous cell tumors, and TINCR expression suppresses the growth of SCC cells in vitro and in vivo. Consistently, Tincr knockout mice show accelerated tumor development following UVB skin carcinogenesis and increased penetrance of invasive SCCs. Finally, genetic analyses identify loss-of-function mutations and deletions encompassing the TINCR gene in SCC clinical samples, supporting a tumor suppressor role in human cancer. Altogether, these results demonstrate a role for TINCR as a protein-coding tumor suppressor gene recurrently lost in squamous cell carcinomas.
    DOI:  https://doi.org/10.1038/s41467-023-36713-8
  2. NAR Genom Bioinform. 2023 Mar;5(1): lqad021
      The correct mapping of the proteome is an important step towards advancing our understanding of biological systems and cellular mechanisms. Methods that provide better mappings can fuel important processes such as drug discovery and disease understanding. Currently, true determination of translation initiation sites is primarily achieved by in vivo experiments. Here, we propose TIS Transformer, a deep learning model for the determination of translation start sites solely utilizing the information embedded in the transcript nucleotide sequence. The method is built upon deep learning techniques first designed for natural language processing. We prove this approach to be best suited for learning the semantics of translation, outperforming previous approaches by a large margin. We demonstrate that limitations in the model performance are primarily due to the presence of low-quality annotations against which the model is evaluated. Advantages of the method are its ability to detect key features of the translation process and multiple coding sequences on a transcript. These include micropeptides encoded by short Open Reading Frames, either alongside a canonical coding sequence or within long non-coding RNAs. To demonstrate the use of our methods, we applied TIS Transformer to remap the full human proteome.
    DOI:  https://doi.org/10.1093/nargab/lqad021
  3. Front Plant Sci. 2023;14: 1094715
      The roles of short/small open reading frames (sORFs) have been increasingly recognized in recent years, owing to the rapidly growing number of sORFs identified in various organisms through the development and application of the Ribo-Seq technique, which sequences the ribosome-protected footprints (RPFs) of translating mRNAs. However, special attention should be paid to RPFs used to identify sORFs in plants due to their small size (~30 nt) and the high complexity and repetitiveness of the plant genome, particularly for polyploid species. In this work, we compare different approaches to the identification of plant sORFs, discuss the advantages and disadvantages of each method, and provide a guide for choosing different methods in plant sORF studies.
    Keywords:  genome; plant; ribo-seq; sORFs; small open reading frame
    DOI:  https://doi.org/10.3389/fpls.2023.1094715
  4. RNA. 2023 Mar 06. pii: rna.079525.122. [Epub ahead of print]
      It is estimated that nearly 50% of mammalian transcripts contain at least one upstream open reading frame (uORF), which are typically one to two orders of magnitude smaller than the downstream main ORF. Most uORFs are thought to be inhibitory as they sequester the scanning ribosome, but in some cases allow for translation re-initiation. However, termination in the 5' UTR at the end of uORFs resembles premature termination that is normally sensed by the nonsense-mediated mRNA decay (NMD) pathway. Translation re-initiation has been proposed as a method for mRNAs to prevent NMD. Here we test how uORF length influences translation re-initiation and mRNA stability in HeLa cells. Using custom 5' UTRs and uORF sequences, we show that re-initiation can occur on heterologous mRNA sequences, favors small uORFs, and is supported when initiation occurs with more initiation factors. After determining reporter mRNA half-lives in HeLa cells and mining available mRNA half-life datasets for cumulative predicted uORF length, we conclude that translation re-initiation after uORFs is not a robust method for mRNAs to prevent NMD. Together, these data suggest that the decision of whether NMD ensues after translating uORFs occurs before re-initiation in mammalian cells.
    Keywords:  NMD; eIF; mRNA decay; ribosome; translational control
    DOI:  https://doi.org/10.1261/rna.079525.122
  5. Nat Biotechnol. 2023 Mar 09.
      The ability to control gene expression and generate quantitative phenotypic changes is essential for breeding new and desired traits into crops. Here we report an efficient, facile method for downregulating gene expression to predictable, desired levels by engineering upstream open reading frames (uORFs). We used base editing or prime editing to generate de novo uORFs or to extend existing uORFs by mutating their stop codons. By combining these approaches, we generated a suite of uORFs that incrementally downregulate the translation of primary open reading frames (pORFs) to 2.5-84.9% of the wild-type level. By editing the 5' untranslated region of OsDLT, which encodes a member of the GRAS family and is involved in the brassinosteroid transduction pathway, we obtained, as predicted, a series of rice plants with varied plant heights and tiller numbers. These methods offer an efficient way to obtain genome-edited plants with graded expression of traits.
    DOI:  https://doi.org/10.1038/s41587-023-01707-w