bims-micpro Biomed News
on Discovery and characterization of microproteins
Issue of 2021‒08‒22
two papers selected by
Thomas Martinez
Salk Institute for Biological Studies


  1. Yi Chuan. 2021 Aug 20. 43(8): 737-746
      Existing research has shown that organisms contain a large number of non-coding RNAs (ncRNAs). Short open reading frames (sORFs) are abundant in sequences inaccurately annotated as ncRNAs. Several sORFs can be transcribed and translated into evolutionarily conserved micropeptides, which were overlooked in previous studies because of their short sequence lengths and the limitations of research techniques. To date, sORF-encoded micropeptides with various functions have been found to play important roles in regulating vital biological activities. This article reviews the functional micropeptides discovered in recent years, introduces MIAC, a new micropeptide that we have discovered, and describes the technologies for mining potential micropeptides, providing insights and references for researchers working on micropeptide discovery.
    Keywords:  micropeptides; non-coding RNA; small open reading frames
    DOI:  https://doi.org/10.16288/j.yczz.21-167
  2. Plant Cell. 2021 Aug 19. pii: koab211. [Epub ahead of print]
      We developed a resource, the Arabidopsis PeptideAtlas (www.peptideatlas.org/builds/arabidopsis/), to address central questions about the Arabidopsis thaliana proteome, such as the significance of protein splice forms and post-translational modifications (PTMs), or simply to obtain reliable information about specific proteins. PeptideAtlas is based on published mass spectrometry (MS) data collected through ProteomeXchange and reanalyzed through a uniform processing and metadata annotation pipeline. All matched MS-derived peptide data are linked to spectral, technical, and biological metadata. Nearly 40 million out of ∼143 million MS/MS (tandem MS) spectra were matched to the reference genome Araport11, identifying ∼0.5 million unique peptides and 17,858 uniquely identified proteins (only one isoform per gene) at the highest confidence level (FDR 0.0004; 2 non-nested peptides ≥9 aa each), designated canonical proteins, as well as 3543 lower-confidence proteins. Physicochemical protein properties were evaluated for targeted identification of unobserved proteins. Additional proteins and isoforms not currently in Araport11 were identified; these were generated from pseudogenes, alternative starts, stops, and/or splice variants, and small open reading frames (sORFs), and these features should be considered when updating the Arabidopsis genome. Phosphorylation can be inspected through a sophisticated PTM viewer. PeptideAtlas is integrated with community resources including TAIR, tracks in JBrowse, PPDB, and UniProtKB. Subsequent PeptideAtlas builds will incorporate millions more MS/MS spectra.
    DOI:  https://doi.org/10.1093/plcell/koab211