bims-micpro Biomed News
on Discovery and characterization of microproteins
Issue of 2023–12–03
five papers selected by
Thomas Farid Martínez, University of California, Irvine



  1. FEMS Microbiol Rev. 2023 Nov 27. pii: fuad063. [Epub ahead of print]
      The ever-growing repertoire of genomic techniques continues to expand our understanding of the true diversity and richness of prokaryotic genomes. Riboproteogenomics laid the foundation for dynamic studies of previously overlooked genomic elements. Most strikingly, bacterial genomes were revealed to harbor robust repertoires of small open reading frames (sORFs) encoding a diverse and broadly expressed range of small proteins, or sORF-encoded polypeptides (SEPs). In recent years, continuous efforts led to great improvements in the annotation and characterization of such proteins, yet many challenges remain to fully comprehend the pervasive nature of small proteins and their impact on bacterial biology. In this work, we review the recent developments in the dynamic field of bacterial genome reannotation, catalog the important biological roles carried out by small proteins and identify challenges obstructing the way to full understanding of these elusive proteins.
    Keywords:  bacterial pathogens; genome (re)annotation; proteomics; riboproteogenomics; small ORF (sORF); small ORF-encoded polypeptide (SEP)
    DOI:  https://doi.org/10.1093/femsre/fuad063
  2. RNA Biol. 2023 Jan;20(1): 943-954
      Building a reference set of protein-coding open reading frames (ORFs) has revolutionized biological process discovery and understanding. Traditionally, gene models have been confirmed using cDNA sequencing, and encoded translated regions have been inferred using sequence-based detection of start and stop combinations longer than 100 amino acids to prevent false positives. This has left small ORFs (smORFs) and their encoded proteins unannotated. Ribo-seq allows translated regions to be distinguished from untranslated ones irrespective of length. In this review, we describe the power of Ribo-seq data in the detection of smORFs while discussing the major challenge posed by data quality, depth and sparseness in identifying the start and end of smORF translation. In particular, we outline smORF cataloguing efforts in humans and the large differences that have arisen due to variation in data, methods and assumptions. Although current versions of smORF reference sets can already be used as a powerful tool for hypothesis generation, we recommend that future editions should consider these data limitations and adopt unified processing for the community to establish a canonical catalogue of translated smORFs.
    Keywords:  RNA translation; Ribo-seq; SEPs; ribosome profiling; smORFs
    DOI:  https://doi.org/10.1080/15476286.2023.2279845
  3. J Proteome Res. 2023 Nov 25.
      The low-molecular-weight proteins (LMWP) in serum and plasma are related to various human diseases and can be valuable biomarkers. A small open reading frame-encoded peptide (SEP) is one kind of LMWP, which has been found to function in many bioprocesses and has also been found in human blood, making it a potential biomarker. The detection of LMWP by a mass spectrometry (MS)-based proteomic assay is often inhibited by the wide dynamic range of serum/plasma protein abundance. Nanoparticle protein coronas are a newly emerging protein enrichment method. To analyze SEPs in human serum, we have developed a protocol that integrates nanoparticle protein coronas with liquid chromatography (LC)/MS/MS. With three nanoparticles, TiO2, Fe3O4@SiO2, and Fe3O4@SiO2@TiO2, we identified 164 new SEPs in the human serum sample. Fe3O4@SiO2 and a nanoparticle mixture obtained the maximum number and the largest proportion of identified SEPs, respectively. Compared with acetonitrile-based extraction, nanoparticle protein coronas can cover more small proteins and SEPs. The magnetic nanoparticles are also suitable for high-throughput parallel protein separation before LC/MS. This method is fast, efficient, reproducible, and easy to operate in 96-well plates and centrifuge tubes, which will benefit the research on SEPs and biomarkers.
    Keywords:  biomarker; low-molecular-weight protein; nanoparticle; protein coronas; sORF-encoded peptides; serum
    DOI:  https://doi.org/10.1021/acs.jproteome.3c00608
  4. J Adv Res. 2023 Nov 24. pii: S2090-1232(23)00357-0. [Epub ahead of print]
      BACKGROUND: Mitochondria-derived peptides (MDPs) represent a recently discovered family of peptides encoded by short open reading frames (ORFs) found within mitochondrial genes. This group includes notable members such as humanin (HN), mitochondrial ORF of the 12S rDNA type-c (MOTS-c), and small humanin-like peptides 1-6 (SHLP1-6). MDPs assume pivotal roles in the regulation of diverse cellular processes, encompassing apoptosis, inflammation, and oxidative stress, which are all essential for sustaining cellular viability and normal physiological functions. Their emerging significance extends beyond this, prompting a deeper exploration into their multifaceted roles and potential applications.
    AIM OF REVIEW: This review aims to comprehensively explore the biogenesis, various types, and diverse functions of MDPs. It seeks to elucidate the central roles and underlying mechanisms by which MDPs participate in the onset and development of cardiovascular diseases (CVDs), bridging the connections between cell apoptosis, inflammation, and oxidative stress. Furthermore, the review highlights recent advancements in clinical research related to the utilization of MDPs in CVD diagnosis and treatment.
    KEY SCIENTIFIC CONCEPTS OF REVIEW: MDP levels are diminished with aging and in the presence of CVDs, rendering them potential new indicators for the diagnosis of CVDs. Also, MDPs may represent a novel and promising strategy for CVD therapy. In this review, we delve into the biogenesis, various types, and diverse functions of MDPs. We aim to shed light on the pivotal roles and the underlying mechanisms through which MDPs contribute to the onset and advancement of CVDs, connecting cell apoptosis, inflammation, and oxidative stress. We also provide insights into the current advancements in clinical research related to the utilization of MDPs in the treatment of CVDs. This review may provide valuable information on MDPs for CVD diagnosis and treatment.
    Keywords:  Apoptosis; Cardiovascular diseases; Inflammation; Mitochondria-derived peptides; Oxidative stress; Therapeutic potentials
    DOI:  https://doi.org/10.1016/j.jare.2023.11.018
  5. Comput Biol Med. 2023 Nov 23. pii: S0010-4825(23)01217-9. [Epub ahead of print] 168: 107752
      The identification and function determination of long non-coding RNAs (lncRNAs) can help to better understand the transcriptional regulation in both normal development and disease pathology, thereby demanding methods to distinguish them from protein-coding RNAs (pcRNAs) after obtaining sequencing data. Many algorithms based on the statistical, structural, physical, and chemical properties of the sequences have been developed for evaluating the coding potential of RNA to distinguish them. In order to design common features that do not rely on hyperparameter tuning and optimization and can be evaluated accurately, we designed a series of features from the effects of open reading frames (ORFs) on their mutual interactions and with the electrical intensity of sequence sites to further improve the screening accuracy. Finally, the single model constructed from our designed features meets the strong classifier criteria, with accuracy between 82% and 89%, and the prediction accuracy of the model constructed after combining the auxiliary features equals or exceeds that of some of the best classification tools. Moreover, our method does not require special hyperparameter tuning operations and is species insensitive compared to other methods, which means this method can be easily applied to a wide range of species. Also, we find some correlations between the features, which provide some reference for follow-up studies.
    Keywords:  Open reading frame; Prediction; Protein-coding potential; Wavelet; lncRNA
    DOI:  https://doi.org/10.1016/j.compbiomed.2023.107752
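
  Illustrative note on item 2: the abstract describes the traditional annotation practice of keeping only start-to-stop combinations longer than 100 amino acids, which is how smORFs end up unannotated. The Python sketch below is one minimal way to picture that filter. It is not taken from any of the papers listed here. It assumes a DNA sequence, ATG as the only start codon, the three standard stop codons, and forward-strand frames only; the function names find_orfs and split_by_length are hypothetical.

```python
# Minimal sketch of a length-based ORF filter (assumptions: ATG start only,
# standard stop codons, forward strand only, first ATG per stop-to-stop window).
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_CANONICAL_AA = 100  # cutoff quoted in the abstract of item 2


def find_orfs(seq):
    """Return (start, end, aa_length) tuples for ATG-to-stop ORFs.

    start/end are 0-based, end-exclusive nucleotide coordinates that
    include the stop codon; aa_length counts codons excluding the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start since the previous stop
            elif start is not None and codon in STOP_CODONS:
                orfs.append((start, i + 3, (i - start) // 3))
                start = None
    return orfs


def split_by_length(orfs, min_aa=MIN_CANONICAL_AA):
    """Split ORFs into those kept by the length cutoff and those dropped."""
    kept = [orf for orf in orfs if orf[2] > min_aa]
    dropped = [orf for orf in orfs if orf[2] <= min_aa]
    return kept, dropped


if __name__ == "__main__":
    short_orf = "ATG" + "GCT" * 10 + "TAA"   # 11 codons before the stop
    long_orf = "ATG" + "GCT" * 120 + "TGA"   # 121 codons before the stop
    kept, dropped = split_by_length(find_orfs(short_orf + long_orf))
    print("kept by cutoff:", kept)
    print("dropped (smORF-sized):", dropped)
```

  In this toy input the 11-codon ORF falls below the cutoff and is dropped, while the 121-codon ORF is kept. A purely sequence-based filter like this cannot tell whether the short ORF is translated, which is the gap the abstract says Ribo-seq evidence addresses.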