bims-micpro Biomed News
on Discovery and characterization of microproteins
Issue of 2025–07–13
four papers selected by
Thomas Farid Martínez, University of California, Irvine



  1. Plant Commun. 2025 Jul 08. pii: S2590-3462(25)00199-3. [Epub ahead of print] 101437
      A substantial but largely uncharted fraction of eukaryotic proteomes is composed of peptides and small proteins (peptidome). In recent years, short open reading frames (sORFs) that could encode functional peptides have been identified in transcripts annotated as non-coding RNAs or in intergenic regions. These sORF-encoded peptides (SEPs) were overlooked in the past due to their small size and difficulty of detection, both experimentally and computationally. However, analyses of translating RNAs (ribosome profiling) and proteomics (mass spectrometry) have provided growing evidence of the existence of a large number of novel, 'non-conventional' peptides in eukaryotic organisms, including plants. In animals, evidence has accumulated indicating that long non-coding RNAs are an important source of SEPs, and that SEPs participate in crucial cellular and physiological processes and can mediate the evolution of novel characteristics. Similar findings are starting to emerge in plants. The SEP-coding capacity and the full repertoire of functional SEPs of any eukaryotic genome are still unclear, but systematic, large-scale molecular screenings are starting to address this issue. Here, we review current progress in the understanding of the plant 'non-conventional' peptidome, explore parallels between plants and animals, and illustrate how findings in animals can help guide plant research on this topic.
    Keywords:  Ribo-Seq; mass spectrometry; microproteins; non-conventional peptides; peptidome; sORF-encoded peptides
    DOI:  https://doi.org/10.1016/j.xplc.2025.101437
  2. Bioinform Adv. 2025;5(1): vbaf134
      Motivation: The 5' untranslated region (5' UTR) of mRNA is crucial for the molecule's translatability and stability, making it essential for designing synthetic biological circuits for high and stable protein expression. Several UTR sequences are patented and widely used in laboratories. This paper presents UTRGAN, a Generative Adversarial Network (GAN)-based model for generating 5' UTR sequences, coupled with an optimization procedure to ensure high expression for target gene sequences or high ribosome load and translation efficiency.
    Results: The model generates sequences mimicking various properties of natural UTR sequences and optimizes them to achieve (i) up to five-fold higher average predicted expression on target genes, (ii) up to two-fold higher predicted mean ribosome load, and (iii) a 34-fold higher average predicted translation efficiency compared to initial UTR sequences. UTRGAN-generated sequences also exhibit higher similarity to known regulatory motifs in regions such as internal ribosome entry sites, upstream open reading frames, G-quadruplexes, and Kozak and initiation start codon regions. In vitro experiments show that the UTR sequences designed by UTRGAN result in a higher translation rate for the human TNF-α protein compared to the human Beta Globin 5' UTR, a UTR with high production capacity.
    Availability and Implementation: The source code, including the model implementation and the optimization procedure, is released at http://github.com/ciceklab/UTRGAN. The dataset was downloaded from the UTRdb 2.0 database and is available within the GitHub repository.
    DOI:  https://doi.org/10.1093/bioadv/vbaf134
  3. J Dairy Sci. 2025 Jul 08. pii: S0022-0302(25)00485-0. [Epub ahead of print]
      Dairy goat milk possesses substantial nutritional value, and understanding the regulatory mechanisms of lactation is crucial for enhancing the milk production performance of dairy goats. During lactation, the mammary gland of dairy goats exhibits marked alterations in the expression of numerous genes. While extensive research has clarified the mechanisms governing mammary gene expression at the transcriptional level, the regulation of these genes at the translational level remains largely unexplored. In this study, ribosome-sequencing and RNA-sequencing analyses were conducted on the mammary glands of dairy goats during both the nonlactation and lactation periods. The findings revealed that the lactation process significantly influences both the translation and transcription of genes, with a notably higher overall translation efficiency (TE) observed during lactation compared with the nonlactation period. Transcription and translation collaboratively regulate gene expression in mammary tissues, thereby constructing a complex regulatory network. We systematically identified small open reading frames (sORFs) in mammary glands, demonstrating that upstream ORFs suppress the translation of main ORFs. RNA-binding proteins (RBPs) were found to significantly influence gene TE. Notably, sORF3917, located in the 5' untranslated region of the FASN gene, was shown to regulate fatty acid synthesis-related gene expression, highlighting the role of sORFs in lactation. This study presents novel insights into the regulatory mechanisms of lactation in dairy goats and provides valuable genetic resources for gene editing and breeding strategies aimed at enhancing dairy goat production.
    Keywords:  RNA-seq; Ribo-seq; fatty acid synthesis; mammary gland gene regulation; small open reading frames; translation efficiency
    DOI:  https://doi.org/10.3168/jds.2025-26363
  4. Neurochem Int. 2025 Jul 07. pii: S0197-0186(25)00092-0. [Epub ahead of print] 106019
      The microtubule-associated protein tau (MAPT) and TAR DNA binding protein (TARDBP) genes play crucial roles in neurodegeneration. The tau protein encoded by MAPT is the main component of tau tangles, a pathologic hallmark of "tauopathies" such as Alzheimer's disease (AD). Cytosolic accumulations of TDP-43, encoded by TARDBP, are characteristic of limbic-predominant age-related TDP-43 encephalopathy (LATE) and other TDPopathies. In addition to the well-characterized mRNA splicing isoforms, both genes generate a multitude of circular RNAs (circRNAs). Both MAPT and TARDBP express circular RNA-specific exons that are characterized by suboptimal splice sites and lengths and are frequently derived from Alu elements. Most circTau and likely all circTARDBP RNAs expressed in brain are human-specific, suggesting a possible unique contribution to human brain disease. TARDBP and MAPT circRNAs harbor open reading frames, and circTau RNAs were shown to be translated into polypeptides in cells. Thus, circRNAs from the MAPT and TARDBP genes should be considered in molecular analysis of AD, LATE, and other neurological diseases.
    DOI:  https://doi.org/10.1016/j.neuint.2025.106019
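
Illustrative note on sORF scanning: several abstracts in this issue refer to short or small open reading frames (sORFs) in transcripts, including upstream ORFs in 5' untranslated regions. As a rough sketch only, and not the method of any paper listed here, the Python snippet below enumerates candidate sORFs on the forward strand of a single transcript. The function name `find_sorfs`, the ATG-only start rule, and the length bounds (10 to 100 codons, a commonly used but not universal cutoff) are assumptions made for illustration. The studies above rely on ribosome profiling and mass spectrometry evidence, which this scan does not model.

```python
# Hypothetical helper for illustration; not taken from any paper in this issue.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_sorfs(seq, min_codons=10, max_codons=100):
    """Return sorted (start, end, frame) tuples for candidate sORFs.

    Scans the three forward reading frames for ORFs that begin at the first
    ATG after the previous stop codon and end at the next in-frame stop.
    Length is counted in codons, excluding the stop codon. Coordinates are
    0-based; end is exclusive and includes the stop codon.
    The length bounds are assumed defaults, not a standard.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    hits = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i  # open a candidate ORF
            elif codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if min_codons <= n_codons <= max_codons:
                    hits.append((start, i + 3, frame))
                start = None  # close the ORF and look for the next ATG
    return sorted(hits)


# Toy transcript: 2 nt leader, ATG, ten GCT codons, a TAA stop, 2 nt tail.
toy = "CC" + "ATG" + "GCT" * 10 + "TAA" + "GG"
candidates = find_sorfs(toy)
```

On the toy transcript the scan reports one candidate in frame 2 that spans the ATG through the stop codon. Real pipelines typically also consider non-ATG start codons, both strands for genomic input, and experimental evidence of translation before calling an sORF.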