bims-micpro Biomed News
on Discovery and characterization of microproteins
Issue of 2026–03–29
six papers selected by
Thomas Farid Martínez, University of California, Irvine



  1. Biochemistry. 2026 Mar 27.
      Microproteins are defined as polypeptides no longer than 100-150 amino acids. With the integrated application of ribosome profiling (Ribo-Seq), mass spectrometry, and bioinformatic approaches, more microproteins have been identified as being encoded by small open reading frames (sORFs). The majority of microproteins are evolutionarily young and may represent species-specific events. This review highlights the current methods for identifying and characterizing novel microproteins, along with their challenges. We also summarize the biologically active microproteins that are involved in biological processes and essential for human physiology and pathology, followed by a discussion of their significant translational potential for diagnosis, prognosis, and therapeutic intervention in human diseases.
    Keywords:  Ribo-seq; immunopeptidomics; mass spectrometry; microproteins; noncanonical ORFs; proteomics; small ORFs; translation
    DOI:  https://doi.org/10.1021/acs.biochem.6c00063
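    Illustration (not from the review): the size-based definition above can be sketched as a simple forward-strand scan for ATG-initiated ORFs whose encoded length falls under a chosen cutoff. The function name `find_sorfs`, the 150-residue ceiling, and the 10-residue floor are assumptions made for this example only; real pipelines also consider the reverse strand, non-ATG starts, and Ribo-Seq or mass spectrometry evidence.

    ```python
    # Minimal sketch: enumerate candidate sORFs in a DNA/cDNA sequence.
    # Assumptions (hypothetical, for illustration): ATG start codons only,
    # forward strand only, length bounds min_aa..max_aa in amino acids.
    STOP_CODONS = {"TAA", "TAG", "TGA"}


    def find_sorfs(seq, max_aa=150, min_aa=10):
        """Return sorted (start, end, aa_length) tuples for ORFs within the bounds.

        start is the 0-based index of the A in ATG; end is the index just past
        the stop codon; aa_length counts codons from ATG up to, not including,
        the stop codon.
        """
        seq = seq.upper()
        hits = []
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    aa_len = (i - start) // 3
                    if min_aa <= aa_len <= max_aa:
                        hits.append((start, i + 3, aa_len))
                    start = None  # resume scanning after the stop codon
        return sorted(hits)


    if __name__ == "__main__":
        toy = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
        print(find_sorfs(toy))
    ```

    A sequence-only scan like this over-predicts heavily, which is why the review stresses combining it with experimental evidence of translation.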
  2. Nucleic Acids Res. 2026 Mar 19. pii: gkag275. [Epub ahead of print]54(6):
      Upstream open reading frames (uORFs) are critical regulators of messenger RNA translation, yet their evolutionary dynamics remain poorly understood. Here, we analyze uORF evolution across Drosophila species and uncover pervasive birth-death turnover. This process is characterized by a persistent excess of upstream start codon (uATG) gains over losses, shaped by the interplay of mutational input and natural selection. We find that the evolutionary conservation of uATGs is strongly associated with translational evidence, indicating a tight coupling between uORF retention and translational output. Lineage-specific uATGs are linked to reduced translation of downstream coding sequences, revealing lineage-dependent regulatory effects. We further identify evolutionary compensation between uATG gain and loss events within genes, supported by functional assays demonstrating frequent and condition-dependent effects on translation. At the population level, canonical uORF variants show signatures of population-specific selection, suggesting a role for uORF turnover in local adaptation. Together, our results reveal how natural selection, translational regulation, and evolutionary turnover jointly shape the uORF landscape in Drosophila.
    DOI:  https://doi.org/10.1093/nar/gkag275
  3. Nucleic Acids Res. 2026 Mar 19. pii: gkag234. [Epub ahead of print]54(6):
      Non-canonical (i.e. unannotated) open reading frames (ncORFs) have until recently been omitted from reference genome annotations, despite evidence of their translation, limiting their incorporation into biomedical research. To address this, in 2022, we initiated the TransCODE consortium and built the first community-driven consensus catalog of human ncORFs, which was openly distributed to the research community via Ensembl-GENCODE. While this catalog represented a starting point for reference ncORF annotation, major technical and scientific issues remained. In particular, this initial catalog had no standardized framework to judge the evidence of translation for individual ncORFs. Here, we present an expanded and refined catalog of the human reference annotation of ncORFs. By incorporating more datasets and by lifting constraints on ORF length and start codon, we define a comprehensive set of 28 359 ncORFs that is nearly four times the size of the previous catalog. Furthermore, to aid users who wish to work with ncORFs with the strongest and most reproducible signals of translation, we utilized a data-driven framework (i.e. translation signature scores) to assess the accumulated evidence for any individual ncORF. Using this approach, we derive a subset of 10 127 ncORFs with translation evidence on par with canonical protein-coding genes, which we refer to as the primary set. This set can serve as a reliable reference for downstream analyses and validation, with a particular emphasis on high quality. Overall, this update reflects continuous community-driven efforts to make ncORFs accessible and actionable to the broader research public, and further iterations of the catalog will continue to expand and refine this resource.
    DOI:  https://doi.org/10.1093/nar/gkag234
  4. Am J Hum Genet. 2026 Mar 24. pii: S0002-9297(26)00106-0. [Epub ahead of print]
      The 5' untranslated region (5' UTR) of messenger RNAs (mRNAs) plays a central role in regulating protein synthesis initiation, particularly through the Kozak sequence and upstream open reading frames (uORFs). Genetic variants within these regulatory elements could affect translation, altering gene expression and contributing to clinical phenotypes in humans. We developed a computational method called 5ULTRA (5' Untranslated Region Annotation) for analysis of whole-exome sequencing and whole-genome sequencing data to detect, annotate, and prioritize 5' UTR variants with potential translation impact. 5ULTRA identifies single-nucleotide variants, indels, and splicing variants that affect uORFs by creating or disrupting start/stop codons and that alter Kozak sequence strength of either the uORFs or the main coding sequence. 5ULTRA incorporates recent uORF databases and provides comprehensive annotations. 5ULTRA implements a machine-learning score to prioritize candidate variants with predicted effects on translation and also provides specific mechanistic predictions. The score correlates strongly with experimentally measured protein-level effects of 5' UTR variants. We applied 5ULTRA to multiple genetics datasets across diverse disease contexts, identifying candidate variants including potential cancer-driving somatic mutations predicted to decrease ABI1 level or increase NRAS abundance; common variants associated with traits such as multiple sclerosis, lung function, and cardiovascular function, by altering protein levels of TAGAP, VRTN, and SPAAR, respectively; and rare germline variants in our cohort, including a splicing variant of RPSA leading to 5' UTR sequence alteration that causes congenital asplenia and a variant of TNF that could predispose to tuberculosis.
    Keywords:  5ULTRA; 5′ UTR; Kozak sequence; genetic disease; method; non-coding; software; translation; uORF
    DOI:  https://doi.org/10.1016/j.ajhg.2026.02.020
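    Illustration (not the 5ULTRA implementation): two of the variant effects named in the abstract, creation or disruption of an upstream start codon and the Kozak context of that codon, can be sketched for a single-nucleotide substitution as follows. The function names are hypothetical, and the three-level Kozak rule used here (purine at -3, G at +4) is a common textbook simplification assumed for the example, not the tool's scoring scheme.

    ```python
    # Minimal sketch: report uATGs gained or lost by an SNV in a 5' UTR,
    # with a coarse Kozak classification. All names are hypothetical.


    def kozak_strength(seq, atg_index):
        """Classify Kozak context from the -3 and +4 positions around an ATG.

        strong: purine at -3 and G at +4; moderate: one of the two; weak: neither.
        Positions falling outside the sequence count as non-matching.
        """
        minus3 = seq[atg_index - 3] if atg_index >= 3 else ""
        plus4 = seq[atg_index + 3] if atg_index + 3 < len(seq) else ""
        score = (minus3 in ("A", "G")) + (plus4 == "G")
        return ("weak", "moderate", "strong")[score]


    def uatg_positions(utr):
        """Return the set of 0-based indices where an ATG begins."""
        return {i for i in range(len(utr) - 2) if utr[i:i + 3] == "ATG"}


    def snv_uatg_effect(utr, pos, alt):
        """Compare uATG sets before and after replacing utr[pos] with alt."""
        utr = utr.upper()
        mutated = utr[:pos] + alt.upper() + utr[pos + 1:]
        before, after = uatg_positions(utr), uatg_positions(mutated)
        return {
            "gained": [(i, kozak_strength(mutated, i)) for i in sorted(after - before)],
            "lost": [(i, kozak_strength(utr, i)) for i in sorted(before - after)],
        }


    if __name__ == "__main__":
        # C>T at index 7 turns ACG into ATG within a GCCACC...G context.
        print(snv_uatg_effect("GCCACCACGGCTT", 7, "T"))
    ```

    This sketch ignores indels, splicing variants, stop-codon changes, and the reading frame of the new uORF relative to the main coding sequence, all of which the abstract lists as handled by the actual tool.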
  5. Methods Protoc. 2026 Mar 10. pii: 45. [Epub ahead of print]9(2):
      Ribosome profiling, or Ribo-Seq, is a powerful tool for studying translation. It maps the positions of translating ribosomes on mRNAs, providing insights into actively expressed genes. Unlike mass spectrometry, Ribo-Seq is not limited by biases related to protein size, concentration, trypsin digestibility, or hydrophobicity. Thus, the translatome has previously been used to discover unannotated genes, including small and overlapping ones that were missed by mass spectrometry or gene prediction models. However, a major limitation of classical ribosome profiling is its complexity, involving multiple steps such as sucrose density gradient centrifugation and gel electrophoresis. These steps make the method costly and time-consuming and limit its throughput. Here, we compared the classical method using gradient centrifugation and size exclusion by gel electrophoresis with shortened versions to evaluate experimental performance and the reductions achieved. Our results show that the sucrose density gradient centrifugation is essential for obtaining accurate Ribo-Seq data, whereas gel electrophoresis for size selection can be omitted (although this requires increased sequencing depth). Thus, future experiments can be conducted with reduced sample input and hands-on time while still achieving a reliable quantification of translation.
    Keywords:  Ribo-Seq; bacterial gene expression; microbial translatomics; ribosome profiling; shortened workflow; translation quantification
    DOI:  https://doi.org/10.3390/mps9020045
  6. J Med Virol. 2026 Apr;98(4): e70894
      Inflammasomes orchestrate the inflammatory response against bacterial and viral infections, thereby initiating the synthesis of pro-inflammatory cytokines, mainly IL-1β and IL-18. SARS-CoV-2 infection induces an inflammatory response mediated by the activation of NLRP1 and NLRP3 inflammasomes. In this study, we demonstrated that the open reading frame 7b (ORF7b) accessory protein of SARS-CoV-2 induces the NLRP3 inflammasome in a recombinant HEK293T model. This resulted in an increase in the distribution of NLRP3 puncta, ASC-specking cells, and caspase-1 activation. ORF7b expression also induced the dispersion of the trans-Golgi network, a well-known step in the activation of the NLRP3 inflammasome. This study proposes an additional, novel mechanism by which SARS-CoV-2 promotes NLRP3 inflammasome activation through ORF7b.
    Keywords:  ASC speck; COVID‐19; IL‐1; NLRP3; SARS‐CoV‐2; inflammation
    DOI:  https://doi.org/10.1002/jmv.70894