bims-micpro Biomed News
on Discovery and characterization of microproteins
Issue of 2024–09–01
seven papers selected by
Thomas Farid Martínez, University of California, Irvine



  1. Biology (Basel). 2024 Jul 26. pii: 563. [Epub ahead of print]13(8):
      Small open reading frames (sORFs; <300 nucleotides or <100 amino acids) are widespread across all genomes, and an increasing variety of them appear to be translated from non-genic regions. Over the past few decades, peptides produced from sORFs have been identified as functional in various organisms, from bacteria to humans. Despite recent advances in next-generation sequencing and proteomics, accurate annotation and classification of sORFs remain a rate-limiting step toward reliable and high-throughput detection of small proteins from non-genic regions. Additionally, computational methods utilizing machine learning cost less than biological experiments, and they can be employed to detect sORFs, laying the groundwork for biological experiments. We present D-sORF, a machine-learning framework that integrates the statistical nucleotide context and motif information around the start codon to predict coding sORFs. D-sORF scores directly for coding identity and requires only the underlying genomic sequence, without incorporating parameters such as conservation, which, in the case of sORFs, may increase the dispersion of scores within the significantly less conserved non-genic regions. D-sORF achieves 94.74% precision and 92.37% accuracy for small ORFs (using the 99 nt medium length window). When D-sORF is applied to sORFs associated with ribosomes, the identification of transcripts producing peptides (annotated by the Ensembl IDs) is similar to or superior to experimental methodologies based on ribosome-sequencing (Ribo-Seq) profiling. In parallel, the recognition of putative negative data, such as the intron-containing transcripts that associate with ribosomes, remains remarkably low, indicating that D-sORF could be efficiently applied to filter out false-positive sORFs that arise in Ribo-Seq data because of the non-productive ribosomal binding or noise inherent in these protocols.
    Keywords:  genomic annotation; machine learning; motif prediction; ribosome sequencing; sORF; small open reading frames
    DOI:  https://doi.org/10.3390/biology13080563
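The length definition quoted in this abstract (sORFs shorter than 300 nucleotides) can be illustrated with a minimal forward-strand scan that enumerates candidate ORFs. This is only a sketch of a candidate-listing step; it is not D-sORF and does not score coding potential. The function name `find_sorfs`, its parameters, and the choice to count the stop codon toward the length are assumptions made for illustration.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_sorfs(seq, max_nt=300):
    """List ATG-to-stop ORFs shorter than max_nt on the forward strand.

    Returns (start, end, orf_sequence) tuples with 0-based start and
    exclusive end. The length includes the stop codon (an assumption).
    Nested in-frame ATGs are reported separately, sharing one stop codon.
    """
    seq = seq.upper()
    hits = []
    # Zero-width lookahead finds every ATG, including overlapping ones.
    for match in re.finditer(r"(?=ATG)", seq):
        start = match.start()
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                if end - start < max_nt:
                    hits.append((start, end, seq[start:end]))
                break  # first in-frame stop closes the ORF
    return hits
```

For example, `find_sorfs("CCATGAAATAGGG")` reports a single 9 nt candidate starting at position 2, while an ORF of 303 nt is excluded by the default threshold. A real pipeline would also scan the reverse complement and consider non-ATG start codons.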
  2. Plant Cell Environ. 2024 Aug 27.
      The availability of high-throughput sequencing technologies has increased our understanding of different genomes. However, the genomes of all living organisms still have many unidentified coding sequences. Many small open reading frames (sORFs) are missed because of the length threshold used in most gene identification tools; this holds in genic regions and, more importantly and surprisingly, in intergenic regions. Scanning the cucumber genome intergenic regions revealed 420 723 sORFs. We excluded 3850 sORFs with similarities to annotated cucumber proteins. To propose the functionality of the remaining 416 873 sORFs, we calculated their codon adaptation index (CAI). We found 398 937 novel sORFs (nsORFs) with CAI ≥ 0.7 that were further used for downstream analysis. Searching against the Rfam database revealed 109 nsORFs similar to multiple RNA families. Using SignalP-5.0 and NLS, we identified 11 592 signal peptides. Five predicted proteins interacting with Meloidogyne incognita and powdery mildew proteins were selected using published transcriptome data of host-pathogen interactions. Gene ontology enrichment interpreted the function of those proteins, illustrating that nsORFs' expression could contribute to the cucumber's response to biotic and abiotic stresses. This research highlights the importance of previously overlooked nsORFs in the cucumber genome and provides novel insights into their potential functions.
    Keywords:  GO terms; codon usage analysis; host‐pathogen interactions; intergenic regions
    DOI:  https://doi.org/10.1111/pce.15104
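The CAI filter mentioned in this abstract can be sketched as follows. The abstract does not say which tool or reference gene set was used, so this assumes the commonly used definition: each codon gets a weight equal to its count divided by the count of the most used synonymous codon in a reference set, and the CAI of a sequence is the geometric mean of those weights. The function names, the 0.5 floor for unseen codons, and the exclusion of Met, Trp, and stop codons are assumptions for illustration.

```python
import math
from collections import Counter

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def codons(seq):
    """Split a sequence into complete codons, dropping a trailing partial one."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

def relative_adaptiveness(reference_seqs, floor=0.5):
    """Weight per codon: count / count of the top synonymous codon."""
    counts = Counter()
    for s in reference_seqs:
        counts.update(c for c in codons(s) if c in CODON_TABLE)
    weights = {}
    # Met, Trp (single codon) and stops are skipped (an assumption).
    for aa in set(AMINO) - {"*", "M", "W"}:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        # Unseen codons get a small floor so the logarithm stays defined.
        freqs = {c: max(counts[c], floor) for c in family}
        top = max(freqs.values())
        for c in family:
            weights[c] = freqs[c] / top
    return weights

def cai(seq, weights):
    """Geometric mean of codon weights over the scored codons of seq."""
    logs = [math.log(weights[c]) for c in codons(seq) if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))
```

With weights built from a hypothetical reference such as `["GCTGCTGCTGCC"]`, a candidate passes a threshold like the one in the abstract when `cai(candidate, weights) >= 0.7`. The result depends strongly on the reference set chosen.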
  3. Immunology. 2024 Aug 26.
      The innate immune response is under selection pressures from changing environments and pathogens. While sequence evolution can be studied by comparing rates of amino acid mutations within and between species, how a gene's birth and death contribute to the evolution of immunity is less known. Short open reading frames, once regarded as untranslated or transcriptional noise, can often produce micropeptides of <100 amino acids with a wide array of biological functions. Some micropeptide sequences are well conserved, whereas others have no evolutionary conservation, potentially representing new functional compounds that arise from species-specific adaptations. To date, few reports have described the discovery of novel micropeptides of the innate immune system. The diversity of immune-related micropeptides is a blind spot for gene and functional annotation. Immune-related micropeptides represent a potential reservoir of untapped compounds for understanding and treating disease. This review consolidates what is currently known about the evolution and function of innate immune-related micropeptides to facilitate their investigation.
    Keywords:  antigens/peptides/epitopes; chemokines; micropeptides; molecular biology
    DOI:  https://doi.org/10.1111/imm.13850
  4. Int J Mol Sci. 2024 Aug 14. pii: 8848. [Epub ahead of print]25(16):
      Rainbow trout (Oncorhynchus mykiss, Walbaum, 1792) is an economically important cold-water fish that is susceptible to heat stress. To date, the heat stress response in rainbow trout is more widely understood at the transcriptional level, while little research has been conducted at the translational level. To reveal the translational regulation of heat stress in rainbow trout, in this study, we performed a ribosome profiling assay of rainbow trout liver under normal and heat stress conditions. Comparative analysis of the RNA-seq data with the ribosome profiling data showed that the fold changes in gene expression at the transcriptional level are moderately correlated with those at the translational level. In total, 1213 genes were significantly altered at the translational level. However, only 32.8% of the genes were common between both levels, demonstrating that heat stress is coordinated across both transcriptional and translational levels. Moreover, 809 genes exhibited significant differences in translational efficiency (TE), with the TE of these genes being considerably affected by factors such as the GC content, coding sequence length, and upstream open reading frame (uORF) presence. In addition, 3468 potential uORFs in 2676 genes were identified, which can potentially affect the TE of the main open reading frames. In this study, Ribo-seq and RNA-seq were used for the first time to elucidate the coordinated regulation of transcription and translation in rainbow trout under heat stress. These findings are expected to contribute novel data and theoretical insights to the international literature on the thermal stress response in fish.
    Keywords:  Ribo-seq; cold-water fish; high-temperature stress; translational efficiency; upstream open reading frames
    DOI:  https://doi.org/10.3390/ijms25168848
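Translational efficiency (TE) as used in this abstract is commonly computed as the ratio of ribosome footprint abundance to mRNA abundance for the same gene, and compared between conditions as a log2 ratio. The abstract does not state the exact procedure, so the following is a minimal sketch under that assumption; the function names, the pseudocount, and the example numbers are hypothetical, and inputs are assumed to be already normalized (for example TPM).

```python
import math

def translational_efficiency(ribo, rna, pseudocount=1.0):
    """TE = normalized Ribo-seq abundance / normalized RNA-seq abundance.

    The pseudocount (an assumption) avoids division by zero for
    genes with no RNA-seq signal.
    """
    return (ribo + pseudocount) / (rna + pseudocount)

def log2_te_change(ribo_ctrl, rna_ctrl, ribo_heat, rna_heat, pseudocount=1.0):
    """log2 ratio of TE under heat stress to TE under control conditions."""
    te_ctrl = translational_efficiency(ribo_ctrl, rna_ctrl, pseudocount)
    te_heat = translational_efficiency(ribo_heat, rna_heat, pseudocount)
    return math.log2(te_heat / te_ctrl)
```

A gene whose footprint signal doubles while its mRNA level is unchanged, for example `log2_te_change(99, 99, 199, 99)`, gives a log2 TE change of about 1. Calling a difference significant requires replicates and a statistical model, which this sketch does not cover.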
  5. J Virol. 2024 Aug 28. e0113224
      The 5' untranslated region (5'UTR) of many positive-stranded RNA viruses contains functional regulatory sequences. Here, we show that the porcine reproductive and respiratory syndrome virus (PRRSV), a member of arteriviruses, harbors small upstream open reading frames (uORFs) in its 5'UTR. Bioinformatics analysis shows that this feature is relatively well conserved among PRRSV strains and Arteriviridae. We also identified a uORF, namely uORF2, in the PRRSV strain JXwn06, that possesses translational activity and exerts a suppressive effect on the expression of the primary ORF, as evidenced by in vitro reporter assays. We tested its importance via reverse genetics by introducing a point mutation into the PRRSV infectious cDNA clone to inactivate the start codon of uORF2. The recovered mutant virus Mut2 surprisingly replicated to the same level as the wild-type virus (WT), but induced a higher level of inflammatory cytokines (e.g., TNF-α, IL-1β, and IL-6) both in vitro and in animal experiments, correlating well with more severe lung injury and a higher death rate. In line with this, over-expression of uORF2 in transfected cells significantly inhibited poly(I:C)-induced expression of inflammatory cytokines. Together, our data support the idea that uORF2 encodes a novel, functional regulator of PRRSV virulence despite its short size.
    IMPORTANCE: PRRSV has remained a major challenge to the world swine industry, but we still do not know much about its biology and pathogenesis. Here, we provide evidence to show that the 5'UTR of PRRSV strain JXwn06 harbors a functional uORF that has coding capacity and regulates the induction of inflammation, as demonstrated by in vitro assays and animal experiments. The findings reveal a novel viral factor that regulates cellular inflammation and provide insight into the understanding of PRRSV pathogenesis.
    Keywords:  inflammation; porcine reproductive and respiratory syndrome virus; translation; uORF; virulence
    DOI:  https://doi.org/10.1128/jvi.01132-24
  6. Biomolecules. 2024 Aug 01. pii: 932. [Epub ahead of print]14(8):
      Translation is one of the main gene expression steps targeted by cellular stress, commonly referred to as translational stress, which includes treatment with anticancer drugs. While translational stress blocks the translation initiation of bulk mRNAs, it nonetheless activates the translation of specific mRNAs known as short upstream open reading frames (uORFs)-mRNAs. Among these, the ATF4 mRNA encodes a transcription factor that reprograms gene expression in cells responding to various stresses. Although the stress-induced translation of the ATF4 mRNA relies on the presence of uORFs (upstream to the main ATF4 ORF), the mechanisms mediating this effect, particularly during chemoresistance, remain elusive. Here, we report that ALKBH5 (AlkB Homolog 5) and FTO (fat mass and obesity-associated protein), the two RNA demethylating enzymes, promote the translation of ATF4 mRNA in a transformed liver cell line (Hep3B) treated with the chemotherapeutic drug sorafenib. Using the in vitro luciferase reporter translational assay, we found that depletion of both enzymes reduced the translation of the reporter ATF4 mRNA upon drug treatment. Consistently, depletion of either protein abrogates the loading of the ATF4 mRNA into translating ribosomes as assessed by polyribosome assays coupled to RT-qPCR. Collectively, these results indicate that the ALKBH5 and FTO-mediated translation of the ATF4 mRNA is regulated at its initiation step. Using in vitro methylation assays, we found that ALKBH5 is required for the inhibition of the methylation of a reporter ATF4 mRNA at a conserved adenosine (A235) site located at its uORF2, suggesting that ALKBH5-mediated translation of ATF4 mRNA involves demethylation of its A235. Preventing methylation of A235 by introducing an A/G mutation into an ATF4 mRNA reporter renders its translation insensitive to ALKBH5 depletion, supporting the role of ALKBH5 demethylation activity in translation. Finally, targeting either ALKBH5 or FTO sensitizes Hep3B to sorafenib-induced cell death, contributing to their resistance. In summary, our data show that ALKBH5 and FTO are novel factors that promote resistance to sorafenib treatment, in part by mediating the translation of ATF4 mRNA.
    Keywords:  ALKBH5; FTO; RNA methylation; stress response; translation regulation
    DOI:  https://doi.org/10.3390/biom14080932
  7. Int J Biol Macromol. 2024 Aug 23. pii: S0141-8130(24)05857-4. [Epub ahead of print] 135051
      Follicular atresia in chickens seriously reduces egg production and the economic benefits of chicken farming. LncRNA plays a key role in the process of follicular atresia. In this study, RNA-seq and Ribo-seq were performed on normal and atretic follicles of Dahen broilers to screen for lncRNAs that may regulate follicular atresia and to study the molecular mechanisms of their regulation. GRN granulin precursor (lncGRN, ID: 101748909) was highly expressed in atretic follicles and showed translational ability. A molecular regulatory network of lncGRN/miR-103-3p/FBXW7 was constructed through bioinformatics analysis and dual-luciferase reporter assays. LncGRN promoted the expression of FBXW7 by adsorbing miR-103-3p, thereby inhibiting the proliferation of chicken granulosa cells (GCs), promoting apoptosis of chicken GCs, and inhibiting steroid hormone synthesis, which in turn induced follicular atresia. Meanwhile, we also found a micropeptide named GRN-122aa, derived from lncGRN, which can promote follicular atresia. In conclusion, our study found that lncGRN promoted follicular atresia through the lncGRN/miR-103-3p/FBXW7 axis and the translated micropeptide GRN-122aa. This study provides new insight into the post-transcriptional regulation mechanism of lncGRN, suggesting that lncGRN may act as a potential regulator of chicken follicle development, and provides a theoretical basis for further improving the egg production of chickens through molecular breeding.
    Keywords:  Follicular atresia; RNA-seq; lncGRN
    DOI:  https://doi.org/10.1016/j.ijbiomac.2024.135051