J Proteomics. 2020 Sep 16. pii: S1874-3919(20)30356-0. [Epub ahead of print] 103988
Short open reading frame-encoded peptides (SEP) represent a largely unexplored part of the proteome. To date, SEP have been analyzed predominantly with bottom-up proteomic workflows, despite the inherent limitations of that approach: incomplete sequence coverage, difficulties with protein inference, and limited ability to identify posttranslational modifications and to assign potential N- and C-terminal truncations. Top-down proteomic workflows can characterize intact proteoforms in much greater detail, which is particularly important for alternatively encoded protein products. However, top-down analysis has its own limitations, notably the need for efficient separation prior to MS analysis. We established a sample preparation approach for the combined bottom-up and top-down proteomic analysis of SEP. Key improvements were achieved by applying solid phase extraction (SPE), which supported enrichment of proteins below ca. 20 kDa, followed by 2D-LC-MS top-down analysis using both HCD and EThcD ion activation. Bottom-up experiments were used to support and confirm the interpretation of the top-down data. This strategy allowed the top-down characterization of 36 proteoforms mapping to 12 SEP from the archaeon Methanosarcina mazei strain Gö1, with concurrent detection and identification of several posttranslational modifications in SEP.
BIOLOGICAL SIGNIFICANCE: Small or short open reading frames (sORF) have been widely neglected in genome research in the past. With their increasing discovery, the question arises whether their translation products, the short open reading frame-encoded peptides (SEP), are present and what their molecular functions are. As these small proteins are usually below 10 kDa, the number of peptides identifiable by bottom-up proteomics is limited, which hampers both their identification and the recognition of potential posttranslational modifications. The presented top-down approach allowed the detection of full-length SEP as well as of terminally truncated proteoforms, and further enabled the identification of disulfide bonds in these small proteins. This demonstrates that this still largely unexplored part of the proteome undergoes the same modifications as classical proteins, an essential step toward understanding the biological functions of these molecules.
Keywords: Disulfide; Microprotein; Short open reading frame; Small open reading frame; Terminomics; Top down
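The abstract refers to terminally truncated proteoforms and disulfide bonds detected at the intact-protein level. As a minimal sketch of the mass arithmetic behind such assignments (not the authors' software or search pipeline), the following Python function computes the monoisotopic neutral mass of a proteoform from its sequence. It assumes unmodified standard residues, and it relies on the fact that each disulfide bond removes two hydrogen atoms (about 2.01565 Da). The function name, its parameters, and the example sequence are hypothetical and chosen only for illustration.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565      # one H2O completes the free N- and C-termini
HYDROGEN = 1.00782503  # monoisotopic mass of one hydrogen atom


def proteoform_mass(sequence, n_trunc=0, c_trunc=0, disulfides=0):
    """Monoisotopic neutral mass of a (hypothetical) proteoform.

    n_trunc / c_trunc: residues removed from the N- / C-terminus.
    disulfides: number of disulfide bonds; each removes two hydrogens.
    """
    if n_trunc < 0 or c_trunc < 0 or n_trunc + c_trunc >= len(sequence):
        raise ValueError("truncation removes the whole sequence")
    trimmed = sequence[n_trunc:len(sequence) - c_trunc]
    if disulfides < 0 or 2 * disulfides > trimmed.count("C"):
        raise ValueError("not enough cysteines for the requested disulfides")
    mass = sum(RESIDUE_MASS[aa] for aa in trimmed) + WATER
    return mass - 2 * HYDROGEN * disulfides


# Made-up sequence, not taken from Methanosarcina mazei.
example = "MKCGAECLK"
full_length = proteoform_mass(example)
oxidized = proteoform_mass(example, disulfides=1)
without_met = proteoform_mass(example, n_trunc=1)
print(round(full_length - oxidized, 5))     # shift caused by one disulfide
print(round(full_length - without_met, 5))  # mass of the removed Met residue
```

In practice, an observed intact mass is compared against such theoretical values within an instrument-dependent tolerance, and the fragment ions (for example from HCD or EThcD) are then used to confirm the assignment.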