bims-micpro Biomed News
on Discovery and characterization of microproteins
Issue of 2026–05–24
ten papers selected by
Thomas Farid Martínez, University of California, Irvine



  1. Protein Sci. 2026 Jun;35(6): e70636
      The illustrations of intricate molecular machineries inside cells created by David Goodsell continue to inspire the scientific community. Here, we aim to extend his artworks to include microproteins, a newly recognized class of small proteins with less than 100 amino acids, encoded by small open reading frames. Given the rapidly expanding number of identified microproteins, potentially exceeding the number of canonical proteins, we highlight, in this perspective article, diverse computational approaches to classify these proteins. By predicting localization, assessing structural homology, and modeling environments and dynamics of microproteins, these methods could provide clues about the subcellular localization of these microproteins and their structural domain homology, guiding further investigation into their biological functions in living systems.
    Keywords:  membrane; microproteins; modeling; organelles; structure prediction; subcellular localization
    DOI:  https://doi.org/10.1002/pro.70636
  2. Funct Integr Genomics. 2026 May 20. pii: 106. [Epub ahead of print]26(1):
      Small open reading frames (sORFs) are increasingly recognized as crucial regulators in bacterial gene expression, yet their biological roles remain largely unexplored in pathogenic species. Here, we investigated the genome-wide regulatory landscape of sORFs in Leptospira interrogans serovar Manilae strain UP-MMC-NIID-LP using RNA-seq-based transcriptomic profiling integrated with weighted gene co-expression network analysis (WGCNA). This study followed the targeted disruption of lomA, a gene mediating 4-methylcytosine (4mC) DNA modification. Loss of 4mC was associated with broad transcriptional dysregulation and phenotypic impairments, including reduced motility, adhesion, and virulence. Analysis of 363 predicted sORFs identified 31 with significant differential expression (FDR < 0.05, |log₂FC| ≥ 1) across wild-type, mutant, and complemented strains. Gene co-expression networks were constructed using WGCNA and prioritized via topological ranking algorithms in Cytoscape. This analysis revealed seven high-confidence putative hub-like sORFs, which were significantly enriched in pathways related to flagellar assembly, DNA recombination, and transcriptional regulation. These candidates appear to function as core components supporting genome stability and adaptive stress responses. Several previously uncharacterized sORFs occupied central positions within co-expression modules, highlighting their potential roles in metabolic and regulatory networks. To our knowledge, this represents the first genome-wide integration of methylation-driven sORF regulation in Leptospira, revealing small proteins as associated with the link between epigenetic control and bacterial pathogenicity and adaptability. These findings provide a foundation for future strategies targeting sORF-mediated regulation in pathogenic spirochetes.
    Keywords:  Leptospira interrogans; Epigenetic regulation; Gene module network; RNA-seq; Small open reading frames; Virulence
    DOI:  https://doi.org/10.1007/s10142-026-01882-4
  3. Mol Cell. 2026 May 19. pii: S1097-2765(26)00275-3. [Epub ahead of print]
      Thousands of non-canonical open reading frames (ORFs) in the human transcriptome are translated into microproteins, many with ribosome occupancy comparable to canonical proteins. Intriguingly, most microproteins fail to accumulate as stable proteins; instead, their derived peptides are widely presented by human leukocyte antigen class I (HLA-I) molecules and show emerging immunomodulatory roles. To understand the underlying biology, we explored the folding and stability landscape of a large microprotein cohort, revealing a fundamental rule that connects the genetic code, protein folding, and stability. Structural modeling and parallel profiling revealed that most microproteins are intrinsically disordered and rapidly degraded. Mechanistically, the high GC content of microprotein-coding sequences, which facilitates non-canonical translation, enriches for residues encoded by multiple GC-rich codons (primarily glycine, arginine, alanine, and proline), thereby promoting structural disorder and terminal-residue motif-mediated, Cullin-RING E3 ubiquitin ligase (CRL)-dependent proteasomal degradation. Together, our findings establish a concise, quantitative rule by which high GC content constrains protein evolvability, revealing how surveillance machinery differentially targets microproteins versus canonical proteins.
    Keywords:  GC content; genetic code; lncORF; microprotein; non-canonical translation; protein stability; protein structure; uORF
    DOI:  https://doi.org/10.1016/j.molcel.2026.04.021
  4. Cardiovasc Res. 2026 May 19. pii: cvag110. [Epub ahead of print]
       AIMS: Microproteins (miPs) translated from small open reading frames (smORFs) are crucial regulators of cell function. However, the expression and function of miPs in endothelial cells, and alterations in miP expression linked with inflammation and cardiovascular disease, remain largely unexplored.
    METHODS AND RESULTS: An optimized proteo-genomic approach combining RiboTag RNA-sequencing and mass spectrometry of the small molecular mass proteome was utilized to identify endothelial cell-specific miPs. Heart, lung and blood vessels from endothelial cell-specific RiboTag mice and human endothelial cells were studied under homeostatic and inflammatory conditions. We identified 2,739 murine as well as 1,365 intracellular and 607 extracellular human endothelial cell miPs encoded from previously unannotated non-canonical smORFs. Vascular inflammation induced in vitro by interleukin-1β and in vivo through PCSK9 overexpression, high fat diet and partial carotid artery ligation, significantly altered smORF expression. An additional 347 miPs were detected in human serum, 23 decreasing and 31 increasing after cardiac damage. The expression of an inflammation-induced miP encoded by an internal smORF within the proline-serine-threonine phosphatase interacting protein 2 (PSTPIP2) transcript, i.e. miP-PSTPIP2, was assessed using a custom antibody. miP-PSTPIP2 expression was upregulated in IL-1β-treated human endothelial cells, in pre-atherosclerotic murine carotid arteries and detected in carotid arteries from patients with atherosclerosis. The relevance of 250 miPs for endothelial cell growth and viability was demonstrated using a high-throughput CRISPR/Cas9 screen.
    CONCLUSIONS: Taken together, we document the existence of a large number of human and murine miPs encoded by non-canonical smORFs and their altered expression in inflammatory conditions. The identification of secreted miPs suggests that they may also exert autocrine or paracrine functions. These novel small peptides modulate cell proliferation and survival in endothelial cells and may play a significant role in human cardiovascular disease.
    DOI:  https://doi.org/10.1093/cvr/cvag110
  5. J Mol Biol. 2026 May 21. pii: S0022-2836(26)00249-4. [Epub ahead of print] 169876
      The RNA world hypothesis attributes both genetic information processing and catalytic functions to ancient RNAs. In modern life, proteins have replaced RNAs in many essential cellular processes; however, RNAs have retained key positions in regulatory networks, with numerous RNA relics still active for various enzymatic and structural functions. An emerging class of riboregulators named dual-function RNAs has noncoding regulatory base-pairing activity and protein-coding capacities within the same molecule. Recent advances in computational and experimental approaches, including ribosome profiling and cryo-electron microscopy, have led to intriguing discoveries of a large diversity of very short proteins or highly structured, large ornate RNAs, thus expanding our knowledge of the functional range for both noncoding and coding sequences. Additionally, new antiphage defense systems are continuously being discovered that challenge the dogma of traditional coding and noncoding functions through mechanisms such as non-canonical cryptic protein-coding from ncRNA genes or the oligomerization of small proteins or ncRNAs into large supramolecular complexes. The current landscape of expressed sequences is also constantly evolving due to de novo gene birth from noncoding sequences and the emergence of noncoding RNAs from coding sequences. In this review, we present recent insights into the role of ncRNAs and small proteins in bacteria-phage interactions, highlighting the overlapping functions associated with coding and noncoding sequences.
    Keywords:  bacterial immunity; dual-function RNAs; ornate lncRNAs; small open reading frames
    DOI:  https://doi.org/10.1016/j.jmb.2026.169876
  6. Genome Biol. 2026 May 18.
       BACKGROUND: A nucleotide sequence can be translated in three reading frames, producing distinct protein products. Many examples of RNA translation in two reading frames (dual coding) have been identified so far.
    RESULTS: We report translation of mRNA transcripts derived from the SRD5A1 locus in all three reading frames that results in the synthesis of long polypeptides. This occurs due to initiation at three nearby AUG codons occurring in all three reading frames. Only one of the three proteoforms contains the conserved catalytic domain of SRD5A1, produced either from the second or the third AUG codon depending on the transcript. Paradoxically, ribosome profiling data and expression reporters indicate that the most efficient translation would produce a catalytically inactive polypeptide. While phylogenetic analysis suggests that the long triple decoding region is specific to primates, the occurrence of nearby AUGs in all three reading frames is ancestral to placental mammals. This suggests that their evolutionary significance lies in the regulation of translation rather than in the biological role of their products. By analysing multiple publicly available ribosome profiling datasets and with gene expression assays carried out in different cellular environments, we show that relative expression of these proteoforms is mutually dependent and varies across environments, supporting this conjecture. We show that a remarkable feature of triple decoding is its resistance to frameshift-causing variants, with apparent implications for the clinical interpretation of genomic sequence variants.
    CONCLUSIONS: We argue for the importance of identification, characterisation and annotation of productive RNA translation irrespective of the presumed biological roles of its products.
    Keywords:  Gene annotation; Overlapping genes; Protein synthesis; Ribosome decision graphs; SRD5A1; Translation control; Translation initiation; Translon; uORF
    DOI:  https://doi.org/10.1186/s13059-026-04106-x
  7. For Res (Fayettev). 2026 ;6 e004
      The functional roles of non-conventional peptides (NCPs) encoded by short open reading frames (sORFs) are increasingly recognized. However, their evolutionary conservation among closely related species remains largely unexplored. This study presented a genome-wide identification of NCPs in the hybrid poplar 84K (P. alba × P. glandulosa), and analyzed NCPs' sequence conservation across six sections of the Populus genus. Using LC-MS/MS with a custom six-frame-translated genome database, 516 conventional peptides (CPs) and 337 NCPs were identified. NCPs exhibited distinct properties, including shorter length and lower molecular weight, compared to CPs. Tissue-specific expression patterns were prominent, with peptides functionally linked to photosynthesis in leaves, cell wall biosynthesis in stems, and nutrient uptake in roots. Allelic analysis revealed a parent-of-origin expression bias for over 10% of peptides, each set enriched in distinct metabolic pathways. Notably, NCP sequences were significantly less conserved than CPs across the genus, though specific conserved motifs were identified. This work provides the first systematic NCP resource for a model hybrid tree, establishing a foundational platform for leveraging peptide biology in molecular forestry and hybrid breeding.
    Keywords:  Allele; Non-conventional peptides; Populus; Tissue-specific; sORFs
    DOI:  https://doi.org/10.48130/forres-0026-0004
  8. Amino Acids. 2026 May 22.
      Once dismissed as transcriptional artifacts, noncoding RNAs (ncRNAs) have gained recognition in recent years for their ability to participate in gene regulation, as well as their ability to encode functional molecules referred to as ncRNA-encoded peptides (ncPEPs). The discovery of ncPEPs has opened new avenues in proteomics and genomics research, revealing biological mechanisms that were previously unexplored. This review presents an extensive overview of the computational tools, databases, and in silico strategies used to identify ncRNA-encoded peptides across all major ncRNA classes, including long noncoding RNAs (lncRNAs), circular RNAs (circRNAs), and primary microRNAs (pri-miRNAs). Furthermore, we outline publicly available databases that compile experimentally validated and computationally predicted ncPEPs across multiple species, enabling systematic annotation and cross-referencing of candidate peptides. By highlighting the current challenges and emerging methodologies, we emphasize how computational methods continue to advance our ability to uncover hidden functional peptides within the noncoding transcriptome. These developments provide a framework for validating ncPEPs and elucidating their biological significance across diverse systems.
    Keywords:  In silico peptide discovery; Coding potential prediction; Long noncoding RNAs; Small open reading frames; ncPEPs; ncRNA-encoded peptides; sORFs
    DOI:  https://doi.org/10.1007/s00726-026-03531-3
  9. Microb Pathog. 2026 May 20. pii: S0882-4010(26)00296-2. [Epub ahead of print]217 108570
      Campylobacter jejuni (C. jejuni) is a prominent foodborne pathogen commonly associated with poultry, representing a potential concern for global public health. Here, we identified a novel cyclic antimicrobial peptide, N1-7567, encoded by a noncanonical small open reading frame (ORF) in a Bacillus licheniformis (B. licheniformis) isolate. N1-7567 (AFLKRFSCRLIRAGKYLSCLLQPAA) adopts a cyclic conformation formed by a disulfide bond between two cysteine residues. N1-7567 exhibited a minimum inhibitory concentration (MIC) against C. jejuni of 64 μg/mL. Studies revealed that N1-7567 associates with the bacterial membrane, disrupts bacterial membrane integrity, and subsequently penetrates into the cytoplasm. This process triggers nitric oxide (NO) release, reactive oxygen species (ROS) accumulation, adenosine triphosphate (ATP) leakage, and direct interaction with bacterial genomic DNA, collectively leading to bacterial growth inhibition. Notably, N1-7567 significantly reduced Galleria mellonella mortality. Together, these findings highlight N1-7567 as a promising candidate for controlling C. jejuni infection and transmission.
    Keywords:  Antimicrobial peptide; Campylobacter jejuni; Cyclic antimicrobial peptide; Membrane disruption
    DOI:  https://doi.org/10.1016/j.micpath.2026.108570
  10. Autophagy. 2026 May 19.
      Distal ischemic necrosis remains a major challenge in reconstructive surgery. Mitochondria and lysosomes interact via signaling and membrane contacts to maintain cellular homeostasis. Mitochondrial-derived peptide MOTS-c, encoded by the MT-RNR1/12S rRNA open reading frame, enhances mitochondrial function by reducing reactive oxygen species (ROS) and stabilizing the membrane potential, potentially preserving lysosomal integrity and reducing lysosomal membrane permeabilization (LMP). This study investigated the protective effects and underlying mechanisms of MOTS-c in ischemic flaps. RNA sequencing explored MOTS-c mechanisms in ischemic flaps. Tissue clearing, laser speckle contrast imaging and Doppler analyses revealed improved blood flow perfusion following MOTS-c treatment. Histological staining (HE, Masson, F-CHP) demonstrated enhanced angiogenesis and collagen remodeling. Western blotting, ELISA, and immunofluorescence were used to assess pyroptosis, macroautophagy/autophagy, LMP, and MAPK1/ERK2-MAPK3/ERK1-NFKB/NF-κB pathway-related proteins. MOTS-c reduced endothelial pyroptosis, enhanced autophagy, and attenuated LMP in ischemic flaps. Mechanistically, in vivo overexpression of PLA2G4A/cPLA2 (phospholipase A2, group IVA (cytosolic, calcium-dependent)) via AAV confirmed that MOTS-c enhances autophagy and reduces pyroptosis and LMP by suppressing PLA2G4A phosphorylation. Furthermore, MOTS-c inhibited PLA2G4A via the MAPK1-MAPK3-NFKB signaling cascade, thereby reducing LMP and enhancing flap survival. These findings suggest that MOTS-c restores cellular homeostasis by targeting the PLA2G4A-LMP axis, representing a promising therapeutic strategy for improving outcomes in ischemic flap surgery.
    Keywords:  Ischemic flaps; MAPK1-MAPK3-NFKB signaling pathway; MOTS-c; lysosomal membrane permeabilization; pyroptosis
    DOI:  https://doi.org/10.1080/15548627.2026.2677180