bims-micpro Biomed News
on Discovery and characterization of microproteins
Issue of 2026–03–08
nine papers selected by
Thomas Farid Martínez, University of California, Irvine



  1. Anal Chem. 2026 Mar 03.
      Microproteins, encoded by small open reading frames (sORFs), are polypeptides with fewer than 100 amino acids with unique structural and functional characteristics. Protein mass spectrometry is currently the de facto approach to verify the existence of microproteins, but the short length and the low abundance of microproteins pose significant challenges to their detection. Covalent organic frameworks (COFs) with adjustable pore sizes and hydrophobicities have shown excellent performance in the enrichment of short bioactive peptides. Here, we created COF-coated magnetic nanoparticles with an average pore size of 2.72 nm and verified their utility for the enrichment of microproteins from cell lysate. The material identified about 5× more microproteins than uncoated particles, where an average of 109 microproteins per MS run were unveiled with 45 min MS analysis time, and a total of 142 unique microproteins were identified across three replicates with a stringent FDR of 0.01%. The material unveils the greatest number of microproteins and the best reproducibility compared to other methods. We observed that COFs and acid precipitation unveil a unique set of microproteins, which were combined to identify 195 unique microproteins using a total MS instrument time of 4.5 h. Application of COFs with quantitative proteomics identified seven microproteins differentially upregulated during ferroptosis, including three novel microproteins that are robustly confirmed by high-quality MS/MS spectra. These results indicate that the COF offers a robust tool for the identification of microproteins.
    DOI:  https://doi.org/10.1021/acs.analchem.5c05160
  2. Adv Sci (Weinh). 2026 Mar 03. e15707
      Microproteins encoded by small open reading frames are a pivotal blind spot redefining the conventional protein-coding assumptions. However, the annotation of the "dark proteome" remains time- and labor-consuming due to the limited efficiency, sensitivity, and comprehensiveness of existing validation methods. To address these issues, we developed a comprehensive toolbox called CLAIMID to achieve accelerated and ultrasensitive validation at multiple biological scales. As the core of CLAIMID, molecularly imprinted polymers (MIPs), which are synthesized artificial antibodies toward putative microproteins, provide ultrasensitive and precise annotation in combination with surface-enhanced Raman scattering (SERS) detection. The excellent specificity, comparable to antibodies, of MIPs enables high anti-interference against biological matrix. The adaptability of MIPs engineering confers the rigorous validations by CLAIMID at multiple scales (single living cells, cell populations, and tissues) and diverse detection formats (SERS-based immunoassay and imaging, and mass spectrometric identification). Through CLAIMID, we rapidly confirmed the protein-level translation of four predicted microproteins (previously supported only by computational or ribosome profiling data) across various cell lines, and further identified three as potential tumor biomarkers, thereby demonstrating its universality to putative microproteins. Together, we present an annotation toolbox with unparalleled efficiency, sensitivity, and scalability, moving forward for the advent of intriguing microprotein biology era.
    Keywords:  microproteins; molecularly imprinted polymers; non‐coding RNAs; small open reading frames; surface‐enhanced Raman scattering
    DOI:  https://doi.org/10.1002/advs.202515707
  3. Mol Cell. 2026 Mar 04. pii: S1097-2765(26)00105-X. [Epub ahead of print]
      The human genome harbors thousands of unannotated short open reading frames (sORFs) with the potential to encode microproteins, yet their physiological roles remain largely unexplored. Here, we developed sORF-seq, a functional screen that identified hundreds of sORF-encoded microproteins regulating cellular differentiation. Among these, we discovered lncPRESS1 as a critical regulator of cell fate, and remarkably, it acts as a bifunctional RNA. In the nucleus, lncPRESS1 functions as a long non-coding RNA (lncRNA), guiding the genomic distribution of SWI/SNF and orchestrating developmental gene expression programs. In the cytoplasm, lncPRESS1 acts as an mRNA, translated into a microprotein that directs lineage commitment through Sonic hedgehog (SHH) signaling pathways and interactions with the primary cilium. This dual functionality allows lncPRESS1 to coordinate nuclear and cytoplasmic regulatory networks, shaping early embryogenesis and human brain development. Our findings unveil an unexpected paradigm of non-canonical ORFs in choreographing complex pathways, expanding our understanding of the functional genome beyond traditional coding genes.
    Keywords:  bifunctional RNA; de novo gene; embryonic development; lineage commitment; lncRNA; microprotein; moonlighting; sORF-seq
    DOI:  https://doi.org/10.1016/j.molcel.2026.02.010
  4. Acta Neuropathol Commun. 2026 Mar 06.
      Oculopharyngodistal myopathy (OPDM) is characterized by ptosis, ophthalmoparesis, dysphagia, and distal weakness. Myopathological features include rimmed vacuoles and intranuclear inclusions. OPDM is associated with pathogenic CGG repeat expansions in the 5'UTR of LRP12, NOTCH2NLC, GIPC1, RILPL1 and ABCD3. Translation of the repeat in the glycine reading frame has been demonstrated for expansions in FMR1, NOTCH2NLC and GIPC1. To assess for a similar phenomenon with LRP12, we expressed normal or expanded CGG repeats in the context of the 5'UTR of LRP12, upstream of a green fluorescent protein (GFP) in the three repeat reading frames. Repeat dependent translation occurs exclusively in the glycine reading frame. However, unlike other CGG repeat disorders, there is no proximal AUG, or near-AUG cognate initiated polyglycine (polyG) open reading frame in LRP12. Instead, our results support a model in which repeat-associated non-AUG (RAN) mediated polyG translation may initiate within the arginine reading frame and then undergo a +1 translational frameshift into the glycine reading frame. LRP12-associated polyG products form intranuclear SQSTM1/ubiquitin positive inclusions that are cytotoxic and alter the nuclear lamina architecture in transfected cells. While FMR1-associated polyG inclusions are cytosolic, LRP12-associated polyG inclusions are nuclear in transfected skeletal muscle. LRP12 expansion carrier iPSC derived myotubes exhibit SQSTM1 positive intra- and perinuclear inclusions when compared with control patient myotubes, suggesting that polyG expression can occur in patients. Together, these findings provide evidence of RAN translation and polyG-toxicity in LRP12-associated OPDM pathology.
    Keywords:  CGG repeat expansion; LRP12; Oculopharyngodistal myopathy; Polyglycine; RAN translation
    DOI:  https://doi.org/10.1186/s40478-026-02272-4
  5. Microlife. 2026; 7: uqag005
      Microproteins (≤70 amino acids) have important and often essential roles in all kingdoms of life, influencing cell motility, regulating membrane transport, and acting as transcription factors. In the halophilic archaeon and model system Haloferax volcanii a significant number of µ-proteins were predicted to be zinc finger proteins. Here we used mass spectrometry-based proteomics to systematically investigate the impact of single gene deletions of 19 zinc finger µ-proteins on the proteome of H. volcanii grown in synthetic medium with glucose as sole carbon and energy source. We employed a state-of-the-art dia-PASEF acquisition strategy, detecting over 3400 proteins across the 19 deletion strains and the wild type. The comprehensive proteome coverage enabled a systematic analysis of proteome remodeling. We found that in 11 out of the 19 mutants the proteome remodeling involved proteins annotated to play a role in cell motility, matching swarming and growth rate phenotypes we observed for these strains. Taken together, our data provide the most comprehensive proteome coverage of H. volcanii to date, and show the effect of 19 different zinc-finger µ-protein deletions on the proteome of this organism. The combined data (available via ProteomeXchange with identifier PXD066008) provide a valuable resource for future research in the field.
    Keywords:  Haloferax volcanii; cell motility; dia-PASEF; proteomics; small proteins; zinc-finger proteins
    DOI:  https://doi.org/10.1093/femsml/uqag005
  6. Mol Cell Proteomics. 2026 Feb 27. pii: S1535-9476(26)00040-X. [Epub ahead of print] 101544
      MiPEPs are microproteins encoded by primary transcripts of microRNAs (pri-miRNAs). MiPEPs were initially identified in plants; we recently characterized a miPEP in Drosophila melanogaster, named miPEP8, which is involved in the regulation of wing size. However, the mechanisms at play are unknown. In the present study, we take advantage of the Drosophila cell line Schneider 2 (S2) to further investigate miPEP8 function at the molecular level. Overexpressing miPEP8 in S2 cells induced a reduction of cell size as well as an increase of the proportion of cells in the G1 phase of the cell cycle and a decrease of the autophagic flux. A proteomics analysis revealed that miPEP8 overexpression in S2 cells induces the upregulation of several proteins including the autophagosome cargo protein ref(2)P (the orthologue of the human p62/Sequestosome 1 protein). The interactome of miPEP8 was generated and revealed interactions between miPEP8 and the mTORC1/autophagy pathway. Bioinformatics analysis identified a short linear motif (SLiM) in the miPEP8 sequence. Mutation of this SLiM prevented the interaction between ref(2)P/p62 and miPEP8. Mutation of the SLiM also reverted the smaller cell size phenotype observed when overexpressing miPEP8 in S2 cells. RNA interference targeting ref(2)P/p62 reversed the cell size phenotype, suggesting that this protein plays a role in the regulation of cell size in Drosophila. Finally, the cell size phenotype was also observed in vivo on wings of flies either mutated or overexpressing miPEP8.
    DOI:  https://doi.org/10.1016/j.mcpro.2026.101544
  7. Mol Cell Endocrinol. 2026 Feb 27. pii: S0303-7207(26)00050-X. [Epub ahead of print] 112773
      Agouti-related peptide (AgRP) and neuropeptide Y (NPY) neurons in the arcuate nucleus integrate metabolic and inflammatory signals to control food intake. FAM237B (Gm8773/NPGM) is a putative orexigenic peptide enriched in a subset of AgRP neurons, yet misannotated as a long non-coding RNA in mice that has since been recognized as a 139 aa microprotein. Here, we combine evolutionary, transcriptomic, and physiological approaches to define FAM237B as an ancient, metabolically regulated neuropeptide within NPY/AgRP neurons. Comparative genomics and synteny analysis show that the Fam237b gene is conserved from jawless vertebrates to mammals and likely predates AgRP, with a highly conserved C-terminal region and sequence consistent with prohormone processing. Single-cell and bulk RNA sequencing reveal that Fam237b is enriched in mouse arcuate AgRP neurons and present at lower levels in the human hypothalamus. In NPY/AgRP hypothalamic cell models and in mice, Fam237b expression rises with fasting or serum withdrawal and is suppressed by insulin in parallel with Agrp. Insulin-mediated repression of Fam237b requires PI3K, but not MEK, signaling. Finally, pro-inflammatory stimuli (LPS, IL-6, and TNF-α) robustly increase Fam237b mRNA in primary hypothalamic cultures and NPY/AgRP cell models. These findings position FAM237B as an evolutionarily conserved micropeptide with a role in hypothalamic regulation of food intake and energy homeostasis, whose expression is jointly tuned by energy status, insulin signaling, and neuroinflammation.
    Keywords:  Fam237b; Gm8773; Hypothalamus; Insulin; NPGM
    DOI:  https://doi.org/10.1016/j.mce.2026.112773
  8. Proc Natl Acad Sci U S A. 2026 Mar 10. 123(10): e2511138123
      During development, cells sequentially acquire specific fates through temporally ordered regulatory systems. To ensure the harmonious progression, each system must be activated and subsequently inactivated at the appropriate time. In this study, we show that the duration of fate induction is controlled by the transient expression of polished rice (pri), a gene encoding micropeptides, during Drosophila tracheal development. pri is transiently expressed in prospective tracheal placodes and precedes the expression of trachealess (trh), a master transcription factor that initiates tracheal fate. pri induces the expression of trh through promoting the disappearance of the repressor form of the transcriptional factor Shavenbaby (Svb). Conversely, after placode invagination, artificially prolonging pri expression or constitutive loss of Svb leads to ectopic maintenance of trh expression in noninvaginated placode cells surrounding the properly invaginated domain. These results indicate that the rapid disappearance of pri properly terminates the initial fate induction system and suggest that this termination ensures a smooth transition to the subsequent fate-regulatory program, that is, the maintenance of tracheal cell fate specifically in the invaginated cells. Together, we propose that the transiency of pri serves as a cell-intrinsic molecular timer that controls the transient phase of cell fate induction and ensures the transition between sequential fate-regulatory systems, thereby enabling the precise coordination of cell identity with morphogenesis during organogenesis.
    Keywords:  Drosophila embryogenesis; cell fate regulation; micropeptide; short open reading frame (sORF); temporal regulation
    DOI:  https://doi.org/10.1073/pnas.2511138123