bims-micpro Biomed News
on Discovery and characterization of microproteins
Issue of 2025–03–09
eight papers selected by
Thomas Farid Martínez, University of California, Irvine



  1. bioRxiv. 2025 Feb 26. pii: 2025.02.19.639069. [Epub ahead of print]
      Thousands of short open reading frames (sORFs) are translated outside of annotated coding sequences. Recent studies have pioneered searching for sORF-encoded microproteins in mass spectrometry (MS)-based proteomics and peptidomics datasets. Here, we assessed literature-reported MS-based identifications of unannotated human proteins. We find that studies vary by three orders of magnitude in the number of unannotated proteins they report. Of nearly 10,000 reported sORF-encoded peptides, 96% were unique to a single study, and 12% mapped to annotated proteins or proteoforms. Manual curation of a benchmark dataset of 406 manually evaluated spectra from 204 sORF-encoded proteins revealed large variation in peptide-spectrum match (PSM) quality between studies, with immunopeptidomics studies generally reporting higher quality PSMs than conventional enzymatic digests of whole cell lysates. We estimate that 65% of predicted sORF-encoded protein detections in immunopeptidomics studies were supported by high-quality PSMs versus 7.8% in non-immunopeptidomics datasets. Our work stresses the need for standardized protocols and analysis workflows to guide future advancements in microprotein detection by MS towards uncovering how many human microproteins exist.
    DOI:  https://doi.org/10.1101/2025.02.19.639069
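For readers unfamiliar with what a short open reading frame looks like computationally, the following is a minimal sketch of a forward-strand sORF scanner. It is not the workflow used in the study above. The function name `find_sorfs`, the ATG-only start codon rule, and the 100-codon cutoff are illustrative assumptions; many real sORFs use non-canonical start codons and lie on either strand.

```python
# Toy forward-strand sORF scanner (illustrative assumption, not the study's method).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(seq, max_codons=100, min_codons=2):
    """Return (start, end, codon_count) tuples for ATG-initiated ORFs.

    start is the 0-based index of the ATG, end is one past the stop codon,
    and codon_count excludes the stop codon. Only the forward strand is
    scanned, and the cutoffs are hypothetical defaults.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            # Walk codon by codon until a stop codon or the end of the sequence.
            j = i
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(seq):
                break  # no in-frame stop codon remains in this frame
            codon_count = (j - i) // 3
            if min_codons <= codon_count <= max_codons:
                hits.append((i, j + 3, codon_count))
            i = j + 3  # resume scanning after the stop codon
    return hits


if __name__ == "__main__":
    print(find_sorfs("ATGGCCAAATAA"))
```

Because scanning resumes after each stop codon, the sketch reports only the longest ATG-initiated ORF per stop codon in each frame; nested in-frame start codons are not listed separately.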
  2. J Transl Med. 2025 Feb 28. 23(1): 250
       BACKGROUND: CircRNAs are closely related to ferroptosis in gastric cancer cells; however, the mechanism by which circRNAs regulate ferroptosis in gastric carcinogenesis remains unknown. CircRNA-encoded novel peptides are functional products translated from the open reading frames (ORFs) within circular RNAs, demonstrating that circRNAs not only serve as non-coding regulators but also have the capacity to encode biologically active peptides. Compared with noncancerous cells, cancer cells have greater iron requirements, and ferroptosis occurs in response to radiotherapy, chemotherapy, and immunotherapy; therefore, ferroptosis activation may be a potential strategy to overcome the shortcomings of conventional cancer therapy.
    METHODS: A mouse model of ferroptosis in gastric cancer was constructed, and a bioinformatics analysis was performed to analyze and characterize the circRNAs involved in ferroptosis in gastric cancer. The inhibitory effect of hsa_circ_0002301 on ferroptosis in tumors was confirmed both in vitro and in vivo. The presence and expression of HECTD1-463aa were verified using mass spectrometry, protein blotting, and immunofluorescence staining. The molecular mechanism of hsa_circ_0002301 was investigated using mass spectrometry and immunoprecipitation.
    RESULTS: We designed and synthesized antibodies specific for HECTD1-463aa, the small protein encoded by hsa_circ_0002301, to verify its presence. We purified HECTD1-463aa by constructing FLAG-tagged hsa_circ_0002301 overexpression vectors and used liquid chromatography-tandem mass spectrometry (LC‒MS/MS) to detect its characteristic peptides. In addition, HECTD1 binding to HECTD1-463aa was identified by co-immunoprecipitation (Co-IP) and mass spectrometry. We found that HECTD1-463aa inhibited HECTD1-mediated GPX4 ubiquitination by binding to HECTD1, an important regulator of cell death in ferroptotic cancer cells.
    CONCLUSIONS: Through its encoded protein HECTD1-463aa, which binds HECTD1, hsa_circ_0002301 competitively inhibits HECTD1-mediated degradation of the GPX4 protein and thereby affects the ferroptosis level in gastric cancer cells.
    DOI:  https://doi.org/10.1186/s12967-025-06226-7
  3. Plant Sci. 2025 Feb 26. pii: S0168-9452(25)00051-2. [Epub ahead of print]354 112433
      Small peptides (SPs), emerging as crucial signaling molecules in plants, regulate diverse processes such as plant development, stress tolerance, and nutrient acquisition. Consisting of fewer than 100 amino acids, SPs are classified into two main groups: precursor-derived SPs and small open reading frame (sORF)-encoded SPs, including miRNA-encoded SPs. SPs are secreted from various plant parts, with root-derived SPs playing particularly significant roles in stress tolerance and nutrient uptake. Even at low concentrations, root-derived SPs are highly effective signaling molecules that influence the distribution and effects of phytohormones, particularly auxin. For instance, under low phosphorus conditions, CLAVATA3/Embryo-Surrounding Region-Related (CLE/CLV), a root-derived SP, enhances root apical meristem differentiation and root architecture to improve phosphate acquisition. By interacting with CLV2 and PEPR2 receptors, it modulates auxin-related pathways, directing root morphology changes to optimize nutrient uptake. During nitrogen (N) starvation, root-derived SPs are transported to the shoot, where they interact with leucine-rich repeat receptor kinases (LRR-RKs) to alleviate nitrogen deficiency. Similarly, C-terminally Encoded Peptides (CEPs) are involved in primary root growth and N-acquisition responses. Despite the identification of many SPs, countless others remain to be discovered, and the functions of those identified so far remain elusive. This review focuses on the functions of root-derived SPs, such as CLE, CEP, RALF, RGF, PSK, PSY, and DVL, and discusses the receptor-mediated signaling pathways involved. Additionally, it explores the roles of SPs in root architecture, plant development, and their metabolic functions in nutrient signaling.
    Keywords:  Nutrient signaling pathway; Plant metabolism; Root architecture; Root-derived small peptide; Small peptide; Stress tolerance
    DOI:  https://doi.org/10.1016/j.plantsci.2025.112433
  4. Brief Bioinform. 2025 Mar 04. pii: bbaf087. [Epub ahead of print]26(2):
      Cancer neoantigens are peptides that originate from alterations in the genome, transcriptome, or proteome. These peptides can elicit cancer-specific T-cell recognition, making them potential candidates for cancer vaccines. The rapid advancement of proteomics technology holds tremendous potential for identifying these neoantigens. Here, we provide an up-to-date survey of database-based search methods and de novo peptide sequencing approaches in proteomics, and we compare these methods to recommend reliable analytical tools for neoantigen identification. Unlike previous surveys on mass spectrometry-based neoantigen discovery, this survey summarizes the key advancements in de novo peptide sequencing approaches that utilize artificial intelligence. In a comparative study on a dataset of the HepG2 cell line and nine mixed hepatocellular carcinoma proteomics samples, we demonstrate the potential of proteomics for the identification of cancer neoantigens and compare the existing methods to illustrate their limits. Based on these limits, we suggest a novel workflow for neoantigen discovery as a perspective.
    Keywords:  cancer neoantigens; database-based search methods; de novo peptide sequencing; deep learning; proteomics
    DOI:  https://doi.org/10.1093/bib/bbaf087
  5. Mol Cell Proteomics. 2025 Mar 03. pii: S1535-9476(25)00036-2. [Epub ahead of print] 100938
      Human leukocyte antigen class I (HLA-I) molecules present short peptide sequences from endogenous or foreign proteins to cytotoxic T cells. The low abundance of HLA-I peptides poses significant technical challenges for their identification and accurate quantification. While mass spectrometry (MS) is currently a method of choice for direct system-wide identification of the cellular immunopeptidome, there is still a need for enhanced sensitivity in detecting and quantifying tumor-specific epitopes. As gas phase separation in data-dependent MS data acquisition (DDA) increased HLA-I peptide detection by up to 50%, here, we aimed to evaluate the performance of data-independent acquisition (DIA) in combination with ion mobility (diaPASEF) for high-sensitivity identification of HLA-presented peptides. Our streamlined diaPASEF workflow enabled identification of 11,412 unique peptides from 12.5 million A375 cells and 3,426 8-11mers from as few as 500,000 cells with high reproducibility. By taking advantage of HLA binder-specific in-silico predicted spectral libraries, we were able to further increase the number of identified HLA-I peptides. We applied SILAC-DIA to a mixture of labeled HLA-I peptides, calculated heavy-to-light ratios for 7,742 peptides across 5 conditions and demonstrated that diaPASEF achieves high quantitative accuracy up to 4-fold dilution. Finally, we identified and quantified shared neoantigens in a monoallelic C1R cell line model. By spiking in heavy synthetic peptides, we verified the identification of the peptide sequences and calculated relative abundances for 13 neoantigens. Taken together, diaPASEF analysis workflows for HLA-I peptides can increase the peptidome coverage for lower sample amounts. The sensitivity and quantitative precision provided by DIA can enable the detection and quantification of less abundant peptide species such as neoantigens across samples from the same background.
    DOI:  https://doi.org/10.1016/j.mcpro.2025.100938
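The heavy-to-light ratio calculation mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the arithmetic, not the authors' pipeline. The function names and the intensity values are hypothetical, and real SILAC-DIA quantification involves further filtering and normalization steps that are not shown.

```python
import math
from statistics import median


def log2_ratios(heavy, light):
    """Per-peptide log2(heavy / light), skipping non-positive intensities."""
    return [
        math.log2(h / l)
        for h, l in zip(heavy, light)
        if h > 0 and l > 0
    ]


def median_log2_error(heavy, light, expected_ratio):
    """Median observed log2 ratio minus the expected log2 ratio.

    A value near 0 means the measured ratios match the known mixing ratio.
    """
    return median(log2_ratios(heavy, light)) - math.log2(expected_ratio)


if __name__ == "__main__":
    # Hypothetical intensities for three peptides mixed at a 4:1 heavy-to-light ratio.
    heavy = [400.0, 800.0, 200.0]
    light = [100.0, 200.0, 50.0]
    print(median_log2_error(heavy, light, expected_ratio=4))
```

Working in log2 space makes a 4-fold excess and a 4-fold deficit symmetric around zero, which is why ratio accuracy is usually assessed on that scale.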
  6. Genes Dis. 2025 May;12(3): 101347
      Circular RNA (circRNA) is a covalently closed single-stranded RNA that lacks 5' and 3' ends and has long been considered a noncoding RNA. With the development of high-throughput sequencing and bioinformatics technology, the understanding of circRNA has become increasingly advanced. Recent studies have shown that some cytoplasmic circRNAs can be effectively translated into detectable proteins, further indicating the importance of circRNA in cellular pathology and physiological functions. Internal ribosome entry site (IRES) and N6-methyladenosine (m6A) mediated cap-independent translation initiation are considered potential mechanisms of circRNA translation. Multiple circRNAs have been shown to play crucial roles in human cancer. This paper provides an overview of the nature and functions of circRNA and describes the possible mechanisms underlying the initiation of circRNA translation. We summarize the emerging functions of circRNA-encoded proteins in human cancer. Finally, we discuss the therapeutic potential of circRNAs and the challenges of research in this field. This review on circRNA translation will reveal a hidden human proteome and enhance our understanding of the importance of circRNAs in human malignant tumors.
    Keywords:  Cancer; Cap-independent; Circular RNA; Protein-coding circRNA; Translation
    DOI:  https://doi.org/10.1016/j.gendis.2024.101347
  7. J Mol Neurosci. 2025 Mar 06. 75(1): 30
      Recent improvements in the accuracy of long-read sequencing (LRS) technologies have expanded the scope for novel transcriptional isoform discovery. Additionally, these advancements have improved the precision of transcript quantification, enabling a more accurate reconstruction of complex splicing patterns and transcriptomes. Thus, this project aims to take advantage of these analytical developments for the discovery and analysis of RNA isoforms in the human brain. A set of novel transcript isoforms was compiled using three bioinformatic tools, quantifying their expression across eight replicates of the cerebellar hemisphere, five replicates of the frontal cortex, and six replicates of the putamen. By taking a subset of the novel isoforms consistent across all discovery methods, a set of 170 highly confident novel RNA isoforms was curated for downstream analysis. This set consisted of 104 messenger RNA (mRNA) and 66 long non-coding RNA (lncRNA) isoforms. The detailed structure, expression, and potential encoded proteins of the novel mRNA isoform BambuTx321 have been further described as an exemplary representative. Additionally, tissue-specific expression [mean counts per million (CPM) of 5.979] of the novel lncRNA BambuTx1299 in the cerebellar hemisphere was observed. Overall, this project has identified and annotated several novel RNA isoforms across diverse tissues of the human brain, providing insights into their expression patterns and investigating their potential functional roles. Thus, this project has contributed to a more comprehensive understanding of the brain's transcriptomic landscape for applications in basic research.
    Keywords:  Alternative splicing; Brain; Long-read sequencing; Novel isoforms; Transcript isoforms; Transcriptomics
    DOI:  https://doi.org/10.1007/s12031-025-02316-9
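The counts-per-million (CPM) figure quoted in the abstract above is a library-size normalization: a transcript's read count divided by the total count in the sample, scaled to one million. The sketch below shows that arithmetic and the averaging across replicates. The transcript names and counts are hypothetical, and the study's actual quantification tool may apply additional steps.

```python
def counts_per_million(counts):
    """Scale raw per-transcript counts so that each sample sums to 1,000,000."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counts")
    return {name: value * 1_000_000 / total for name, value in counts.items()}


def mean_cpm(samples, transcript):
    """Mean CPM of one transcript across replicate samples."""
    values = [counts_per_million(sample)[transcript] for sample in samples]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Two hypothetical replicates with made-up transcript names and counts.
    replicates = [
        {"tx_a": 5, "tx_b": 999_995},
        {"tx_a": 7, "tx_b": 999_993},
    ]
    print(mean_cpm(replicates, "tx_a"))
```

Normalizing each replicate before averaging keeps replicates with deeper sequencing from dominating the mean.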