bims-metlip Biomed News
on Methods and protocols in metabolomics and lipidomics
Issue of 2023‒10‒01
sixteen papers selected by
Sofia Costa, Matterworks

  1. Anal Chem. 2023 Sep 28.
      We developed an accurate method for determining diacylglycerols (DAGs) in human plasma using a fluorous biphasic liquid-liquid extraction method, followed by liquid chromatography with tandem mass spectrometry (LC-MS/MS) analysis. The lipid mixture in the plasma was first extracted with chloroform by using the Bligh-Dyer method. The resulting solution was subjected to fluorous biphasic liquid-liquid extraction to remove phospholipids, which are known to cause matrix effects during the LC-MS/MS analysis. In this method, phospholipids in a lipid mixture solution (nonfluorous solvent) were selectively extracted into tetradecafluorohexane (fluorous solvent) through fluorous affinity, by forming a complex with a perfluoropolyethercarboxylic acid-lanthanum(III) salt. The DAGs remaining in the nonfluorous solvent could be injected directly into the LC system and analyzed in positive electrospray ionization-MS/MS mode. The fluorous biphasic extraction removed more than 99.9% of the phospholipids, enabling LC-MS/MS analysis of DAGs in human plasma without phospholipid-derived matrix effects. Furthermore, the applicability of this method and the possibility of using DAGs as biomarkers were evaluated by applying the method to plasma samples from patients with major depressive disorder, chosen as a related disease.
  2. Metabolites. 2023 Aug 22. pii: 966. [Epub ahead of print]13(9):
      Liquid chromatography-mass spectrometry (LC-MS) is the key technique for analyzing complex lipids in biological samples. Various LC-MS modes are used for lipid separation, including different stationary phases, mobile-phase solvents, and modifiers. Quality control in lipidomics analysis is crucial to ensuring the generated data's reliability, reproducibility, and accuracy. While several quality control measures are commonly discussed, the impact of organic solvent quality during LC-MS analysis is often overlooked. Additionally, the annotation of complex lipids remains prone to biases, leading to potential misidentifications and incomplete characterization of lipid species. In this study, we investigate how LC-MS-grade isopropanol from different vendors may influence the quality of the mobile phase used in LC-MS-based untargeted lipidomic profiling of biological samples. Furthermore, we report the occurrence of an unusual, yet highly abundant, ethylamine adduct [M+46.0651]+ that may form for specific lipid subclasses during LC-MS analysis in positive electrospray ionization mode when acetonitrile is part of the mobile phase, potentially leading to lipid misidentification. These findings emphasize the importance of considering solvent quality in LC-MS analysis and highlight challenges in lipid annotation.
    Keywords:  MS/MS annotation; adduct formation; lipidomics; lipids; liquid chromatography; mass spectrometry; metabolomics; method development; misidentification; solvent quality
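The adduct arithmetic behind entry 2 can be sketched as follows. This is an illustrative calculation, not code from the paper: the function names, the tolerance, and the neutral mass of 700.0 Da are hypothetical. It shows that ethylamine plus a proton (C2H8N+) reproduces the reported shift of 46.0651, and that this shift differs from the ammonium adduct shift by the mass of C2H4, which is one plausible way such an ion could be mistaken for the ammonium adduct of a lipid with two more carbons.

```python
# Monoisotopic masses in Da (standard values, rounded).
MASS_C = 12.0
MASS_H = 1.00782503
MASS_N = 14.00307401
MASS_ELECTRON = 0.00054858

# Mass shift added to a neutral monoisotopic mass M for singly charged ions.
ADDUCT_SHIFTS = {
    "[M+H]+": MASS_H - MASS_ELECTRON,
    "[M+NH4]+": MASS_N + 4 * MASS_H - MASS_ELECTRON,
    # Ethylamine (C2H7N) plus a proton, i.e. C2H8N+.
    "[M+C2H8N]+": 2 * MASS_C + MASS_N + 8 * MASS_H - MASS_ELECTRON,
}


def adduct_mz(neutral_mass, adduct):
    """Expected m/z of a singly charged adduct of a neutral molecule."""
    return neutral_mass + ADDUCT_SHIFTS[adduct]


def matching_adducts(observed_mz, neutral_mass, tol_ppm=5.0):
    """Adduct labels whose expected m/z lies within tol_ppm of observed_mz."""
    hits = []
    for name in ADDUCT_SHIFTS:
        expected = adduct_mz(neutral_mass, name)
        if abs(observed_mz - expected) / expected * 1e6 <= tol_ppm:
            hits.append(name)
    return hits


# Hypothetical neutral mass, used only to exercise the functions.
example_mass = 700.0
for label in ADDUCT_SHIFTS:
    print(label, round(adduct_mz(example_mass, label), 4))

# Difference between the ethylamine and ammonium shifts (the mass of C2H4).
print(round(ADDUCT_SHIFTS["[M+C2H8N]+"] - ADDUCT_SHIFTS["[M+NH4]+"], 4))
```

Including the ethylamine shift in the adduct list during annotation is one way to flag such ions instead of assigning them to a different lipid.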
  3. bioRxiv. 2023 Sep 16. pii: 2023.09.15.558013. [Epub ahead of print]
      The heart contracts incessantly and requires a constant supply of energy, utilizing numerous metabolic substrates such as fatty acids, carbohydrates, lipids, and amino acids to meet its high energy demands. Therefore, a comprehensive analysis of various metabolites is urgently needed for understanding cardiac metabolism; however, complete metabolome analyses remain challenging due to the broad range of metabolite polarities, which makes extraction and detection difficult. Herein, we implemented parallel metabolite extractions and high-resolution mass spectrometry (MS)-based methods to obtain a comprehensive analysis of the human heart metabolome. To capture the diverse range of metabolite polarities, we first performed six parallel liquid-liquid extractions (three monophasic, two biphasic, and one triphasic) of healthy human donor heart tissue. Next, we utilized two complementary MS platforms for metabolite detection: direct-infusion ultrahigh-resolution Fourier-transform ion cyclotron resonance (DI-FTICR) MS and high-resolution liquid chromatography quadrupole time-of-flight tandem MS (LC-Q-TOF MS/MS). Using DI-FTICR MS, 9,521 metabolic features were detected, of which 7,699 were assigned a chemical formula and 1,756 were annotated by accurate mass assignment. Using LC-Q-TOF MS/MS, 21,428 metabolic features were detected, of which 626 metabolites were identified based on fragmentation matching against publicly available libraries. Collectively, 2,276 heart metabolites were identified in this study, spanning a wide range of polarities including polar (benzenoids, alkaloids and derivatives, and nucleosides) as well as non-polar (phosphatidylcholines, acylcarnitines, and fatty acids) compounds. The results of this study will provide critical knowledge regarding the selection of appropriate extraction and MS detection methods for the analysis of the diverse classes of human heart metabolites.
  4. Metabolites. 2023 Sep 09. pii: 1002. [Epub ahead of print]13(9):
      Lipidomics refers to the full characterization of lipids present within a cell, tissue, organism, or biological system. One of the bottlenecks affecting reliable lipidomic analysis is the extraction of lipids from biological samples. An ideal extraction method should have a maximum lipid recovery and the ability to extract a broad range of lipid classes with acceptable reproducibility. The most common lipid extraction relies on either protein precipitation (monophasic methods) or liquid-liquid partitioning (bi- or triphasic methods). In this study, three monophasic extraction systems, isopropanol (IPA), MeOH/MTBE/CHCl3 (MMC), and EtOAc/EtOH (EE), alongside three biphasic extraction methods, Folch, butanol/MeOH/heptane/EtOAc (BUME), and MeOH/MTBE (MTBE), were evaluated for their performance in characterization of the mouse lipidome of six different tissue types, including pancreas, spleen, liver, brain, small intestine, and plasma. Sixteen lipid classes were investigated in this study using reversed-phase liquid chromatography/mass spectrometry. Results showed that all extraction methods had comparable recoveries for all tested lipid classes except lysophosphatidylcholines, lysophosphatidylethanolamines, acyl carnitines, sphingomyelins, and sphingosines. The recoveries of these classes were significantly lower with the MTBE method, which could be compensated by the addition of stable isotope-labeled internal standards prior to lipid extraction. Moreover, IPA and EE methods showed poor reproducibility in extracting lipids from most tested tissues. In general, Folch is the optimum method in terms of efficacy and reproducibility for extracting mouse pancreas, spleen, brain, and plasma. However, MMC and BUME methods are more favored when extracting mouse liver or intestine.
    Keywords:  UHPLC-HRMS; lipid extraction; mouse lipidome; mouse tissue; untargeted lipidomics
  5. Anal Chem. 2023 Sep 27.
      Phosphorus metabolites occupy a unique place in cellular function as critical intermediates and products of cellular metabolism. Human blood is the most widely used biospecimen in the clinic and in the metabolomics field, and hence an ability to profile phosphorus metabolites in blood, quantitatively, would benefit a wide variety of investigations of cellular functions in health and diseases. Mass spectrometry (MS) and nuclear magnetic resonance (NMR) spectroscopy are the two premier analytical platforms used in the metabolomics field. However, detection and quantitation of phosphorus metabolites by MS can be challenging due to their lability, high polarity, structural isomerism, and interaction with chromatographic columns. The conventionally used 1H NMR, on the other hand, suffers from poor resolution of these compounds. As a remedy, 31P NMR promises an important alternative to both MS and 1H NMR. However, numerous challenges including the instability of phosphorus metabolites, their chemical shift sensitivity to solvent composition, pH, salt, and temperature, and the lack of identified metabolites have so far restricted the scope of 31P NMR. In the current study, we describe a method to analyze nearly 25 phosphorus metabolites in blood using a simple one-dimensional (1D) NMR spectrum. Establishment of the identity of unknown metabolites involved a combination of (a) comprehensively analyzing an array of 1D and two-dimensional (2D) 1H/31P homonuclear and heteronuclear NMR spectra of blood; (b) mapping the central carbon metabolic pathway; (c) developing and using 1H and 31P spectral and chemical shift databases; and finally (d) confirming the putative metabolite peaks with spiking using authentic compounds. The resulting simple 1D 31P NMR-based method offers an ability to visualize and quantify the levels of intermediates and products of multiple metabolic pathways, including central carbon metabolism, in one step. Overall, the findings represent a new dimension for blood metabolite analysis and are anticipated to greatly impact the blood metabolomics field.
  6. Chemometr Intell Lab Syst. 2023 Sep 15. pii: 104861. [Epub ahead of print]240
      We present metabolite identification software in the form of an R Shiny application. The easy-to-use software performs metabolite identification by mass spectral matching in gas chromatography-mass spectrometry (GC-MS)-based untargeted metabolomics. Various similarity measures are provided, and a toy example using the graphical user interface is presented.
    Keywords:  GC-MS; LC-MS; Mass spectral matching; Metabolite identification
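To make "mass spectral matching" in entry 6 concrete, here is a minimal sketch of one widely used similarity measure, the cosine similarity between two unit-mass spectra. It is not the software described in the abstract (which is an R Shiny application); the function names and the toy spectra are hypothetical, and the abstract does not say which measures the tool implements.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity of two spectra given as {integer m/z: intensity} dicts."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, library):
    """Return (name, score) of the library spectrum most similar to the query."""
    scored = [(name, cosine_similarity(query, ref)) for name, ref in library.items()]
    return max(scored, key=lambda item: item[1])


# Toy library and query with made-up peaks, for illustration only.
toy_library = {
    "compound_A": {73: 100.0, 147: 40.0, 218: 15.0},
    "compound_B": {57: 100.0, 71: 60.0, 85: 30.0},
}
toy_query = {73: 95.0, 147: 45.0, 218: 10.0}
print(best_match(toy_query, toy_library))
```

Real tools usually add intensity and m/z weighting before scoring; that step is left out here to keep the sketch short.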
  7. bioRxiv. 2023 Sep 15. pii: 2023.09.12.557189. [Epub ahead of print]
      Mass spectrometry is a powerful and widely used tool for generating proteomics, lipidomics, and metabolomics profiles, which is pivotal for elucidating biological processes and identifying biomarkers. However, missing values in spectrometry-based omics data may pose a critical challenge for the comprehensive identification of biomarkers and elucidation of the biological processes underlying human complex disorders. To alleviate this issue, various imputation methods for mass spectrometry-based omics data have been developed. However, a comprehensive and systematic comparison of these imputation methods is still lacking, and researchers are frequently confronted with a multitude of options without a clear rationale for method selection. To address this pressing need, we developed omicsMIC (mass spectrometry-based omics with Missing values Imputation methods Comparison platform), an interactive platform that provides researchers with a versatile framework to simulate and evaluate the performance of 28 diverse imputation methods. omicsMIC offers a nuanced perspective, acknowledging the inherent heterogeneity in biological data and the unique attributes of each dataset. Our platform empowers researchers to make data-driven decisions in imputation method selection based on real-time visualizations of the outcomes associated with different imputation strategies. The comprehensive benchmarking and versatility of omicsMIC make it a valuable tool for the scientific community engaged in mass spectrometry-based omics research. omicsMIC is freely available at .
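The simulate-and-evaluate idea in entry 7 can be sketched in a few lines: hide known values, impute them, and score each method against the truth. This is not omicsMIC itself. The two imputation methods, the normalised RMSE score, the function names, and the simulated intensities are all assumptions chosen for illustration, and only one feature (one metabolite across samples) is handled.

```python
import math
import random
import statistics


def hide_values(values, fraction, rng):
    """Set a random subset of entries to None; return the masked list and hidden indices."""
    k = max(1, round(fraction * len(values)))
    hidden = sorted(rng.sample(range(len(values)), k))
    masked = [None if i in hidden else v for i, v in enumerate(values)]
    return masked, hidden


def impute_mean(masked):
    """Replace missing entries with the mean of the observed entries."""
    fill = statistics.fmean([v for v in masked if v is not None])
    return [fill if v is None else v for v in masked]


def impute_half_min(masked):
    """Replace missing entries with half the smallest observed entry."""
    fill = min(v for v in masked if v is not None) / 2.0
    return [fill if v is None else v for v in masked]


def nrmse(truth, imputed, hidden):
    """RMSE over the hidden entries, normalised by the spread of the true values."""
    squared = [(imputed[i] - truth[i]) ** 2 for i in hidden]
    return math.sqrt(statistics.fmean(squared)) / statistics.pstdev(truth)


rng = random.Random(0)
# Simulated log-normal intensities for one feature across 40 samples.
truth = [rng.lognormvariate(10.0, 1.0) for _ in range(40)]
masked, hidden = hide_values(truth, 0.2, rng)

for name, method in {"mean": impute_mean, "half-min": impute_half_min}.items():
    print(name, round(nrmse(truth, method(masked), hidden), 3))
```

Values here are hidden completely at random. Which method scores best depends on the missingness mechanism, and this sketch does not model intensity-dependent missingness.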
  8. Molecules. 2023 Sep 06. pii: 6457. [Epub ahead of print]28(18):
      Lavender (Lavandula angustifolia Miller or Lavandula officinalis Chaix) is an ethnopharmacological plant commonly known as English lavender. Linalool and linalyl acetate are putative phytoactives in lavender essential oil (LEO) derived from the flower heads. LEO has been used in aroma or massage therapy to reduce sleep disturbance and to mitigate anxiety. Recently, an oral LEO formulation was administered in human clinical trials designed to ascertain its anxiolytic effect. However, human pharmacokinetics and an LC-MS/MS method for the measurement of linalool are lacking. To address this deficiency, a rapid and sensitive liquid chromatography-tandem mass spectrometry (LC-MS/MS) method was developed for the analysis of linalool in human serum. Prior to the analysis, a simple sample preparation protocol including protein precipitation and liquid-liquid extraction of serum samples was created. The prepared samples were analyzed using a C18 reversed-phase column and gradient elution (acetonitrile and water, both containing 0.1% formic acid). A Waters Xevo TQ-S tandem mass spectrometer (positive mode) was used to quantitatively determine linalool and IS according to transitions of m/z 137.1→95.1 (tR 0.79 min) and 205.2→149.1 (tR 1.56 min), respectively. The method was validated for precision, accuracy, selectivity, linearity, sensitivity, matrix effects, and stability, and it was successfully applied to characterize the oral pharmacokinetics of linalool in humans. The newly developed LC-MS/MS-based method and its application in clinical trial serum samples are essential for the characterization of potential pharmacokinetic and pharmacodynamic interactions.
    Keywords:  LC–MS/MS; human; linalool; pharmacokinetics
  9. Molecules. 2023 Sep 08. pii: 6523. [Epub ahead of print]28(18):
      Biological properties of menaquinone-7, one of the vitamin K2 vitamers (K2MK-7), both those proven and those that remain to be investigated, arouse extensive interest that goes beyond the strictly scientific framework. The most important of them is the prevention of age-related diseases, considering that we live in the times identified as the era of aging societies and many people are exposed to the vitamin K2MK-7 deficiency. Therefore, an effective analytical protocol that can be adopted as a diagnostic and preventive analytics tool is needed. Herein, a simple sample preparation method followed by the liquid chromatography-tandem mass spectrometry-based method (LC-MS/MS), was used for the selective and sensitive determination of K2MK-7 in serum samples. Under the optimized conditions, using 500 µL of serum and the same amount of n-hexane, the reproducibility and the accuracy were obtained in the ranges of 89-97% and 86-110%, respectively, and the limit of detection value was 0.01 ng/mL. This method was used for the routine analysis. Statistical interpretation of the data from 518 samples obtained during 2 years of practice allowed for obtaining information on the content and distribution of K2MK-7 in the Polish population, broken down by the sex and age groups.
    Keywords:  chromatographic analysis; diagnostic tool; extraction; menaquinone-7 analysis; population variability; sample preparation; vitamin K vitamers
  10. Anal Bioanal Chem. 2023 Sep 25.
      Testosterone (TTe) and free testosterone (FTe) are clinically important indicators for the diagnosis of androgen disorders, so accurate quantitative determination of them in serum is clinically of paramount significance. Currently, there is no available method suitable for routine and simultaneous measurement of TTe and FTe. Here, we developed a new UPLC-MS/MS method to quantify serum TTe and FTe simultaneously and accurately. Rapid equilibrium dialysis was used to obtain FTe in serum followed by derivatization with hydroxylamine hydrochloride. With these strategies, TTe and FTe could be measured in single injection. After optimizing the extraction and derivatization conditions, the performance of LC-MS/MS was evaluated and applied to quantify the levels of TTe and FTe in clinical samples from 42 patients. The assays were linear for TTe within the range of 0.2-30 ng/mL and for FTe within the range of 1.5-1000 pg/mL. This improved method provided a limit of quantification for TTe of 0.2 ng/mL and for FTe of 1.5 pg/mL. The intra- and inter-run CVs were less than 4.3% and 3.6% for TTe and less than 8.2% and 6.7% for FTe, respectively. The intra- and inter-run accuracies for both TTe and FTe were in the range of 96.1-108.1%. Interference, carryover effect, and matrix effect were in acceptable range. In conclusion, our new LC-MS/MS method is simple to perform and can serve as a reliable method for simultaneous determination of TTe and FTe in clinical practice, providing important information for diagnosis, treatment, and monitoring of androgen-related diseases.
    Keywords:  Free testosterone; LC–MS/MS; Serum; Testosterone
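The precision and accuracy figures quoted in entry 10 are conventionally computed as a coefficient of variation and a percent recovery against the nominal concentration. A minimal sketch under that assumption follows; the replicate values and the nominal concentration are made up, and the abstract does not give the authors' exact formulas.

```python
import statistics


def cv_percent(replicates):
    """Coefficient of variation: sample standard deviation over mean, in percent."""
    return 100.0 * statistics.stdev(replicates) / statistics.fmean(replicates)


def accuracy_percent(replicates, nominal):
    """Mean measured concentration as a percentage of the nominal concentration."""
    return 100.0 * statistics.fmean(replicates) / nominal


# Hypothetical intra-run replicates of a QC sample with nominal 5.0 ng/mL.
qc_replicates = [4.9, 5.1, 5.0, 5.2, 4.8]
print(round(cv_percent(qc_replicates), 2), round(accuracy_percent(qc_replicates, 5.0), 1))
```

Intra-run values use replicates from one batch and inter-run values use replicates pooled across batches; the same two functions apply to both.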
  11. Bio Protoc. 2023 Sep 20. 13(18): e4819
      Dietary saturated fatty acids (SFAs) are upregulated in the blood circulation following digestion. A variety of circulating lipid species have been implicated in metabolic and inflammatory diseases; however, due to the extreme variability in serum or plasma lipid concentrations found in human studies, established reference ranges are still lacking, in addition to lipid specificity and diagnostic biomarkers. Mass spectrometry is widely used for identification of lipid species in the plasma, and there are many differences in sample extraction methods within the literature. We used ultra-high performance liquid chromatography (UPLC) coupled to a high-resolution hybrid triple quadrupole-time-of-flight (QToF) mass spectrometry (MS) to compare relative peak abundance of specific lipid species within the following lipid classes: free fatty acids (FFAs), triglycerides (TAGs), phosphatidylcholines (PCs), and sphingolipids (SGs), in the plasma of mice fed a standard chow (SC; low in SFAs) or ketogenic diet (KD; high in SFAs) for two weeks. In this protocol, we used Principal Component Analysis (PCA) and R to visualize how individual mice clustered together according to their diet, and we found that KD-fed mice displayed unique blood profiles for many lipid species identified within each lipid class compared to SC-fed mice. We conclude that two weeks of KD feeding is sufficient to significantly alter circulating lipids, with PCs being the most altered lipid class, followed by SGs, TAGs, and FFAs, including palmitic acid (PA) and PA-saturated lipids. This protocol is needed to advance knowledge on the impact that SFA-enriched diets have on concentrations of specific lipids in the blood that are known to be associated with metabolic and inflammatory diseases.
      Key features
      • Analysis of relative plasma lipid concentrations from mice on different diets using R.
      • Lipidomics data collected via ultra-high performance liquid chromatography (UPLC) coupled to a high-resolution hybrid triple quadrupole-time-of-flight (QToF) mass spectrometry (MS).
      • Allows for a comprehensive comparison of diet-dependent plasma lipid profiles, including a variety of specific lipid species within several different lipid classes.
      • Accumulation of certain free fatty acids, phosphatidylcholines, triglycerides, and sphingolipids is associated with metabolic and inflammatory diseases, and plasma concentrations may be clinically useful.
    Keywords:  Circulating lipids; Free fatty acids; Ketogenic diet; Lipidomics; Mass spectrometry; Phosphatidylcholines; Sphingolipids; Triglycerides
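The PCA step in entry 11 can be illustrated with a small sketch. The protocol itself uses R; the Python below is a stand-in written only for illustration, with hypothetical log-intensity values for three lipids in three SC-fed and three KD-fed mice. It finds the first principal axis by power iteration on the centered data and projects each sample onto it.

```python
import math


def center_columns(rows):
    """Subtract each column's mean so every feature has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def first_component(centered, iterations=200):
    """Leading principal axis (unit vector) via power iteration on X^T X."""
    axis = [1.0] * len(centered[0])
    for _ in range(iterations):
        scores = [sum(x * a for x, a in zip(row, axis)) for row in centered]
        new_axis = [
            sum(s * row[j] for s, row in zip(scores, centered))
            for j in range(len(axis))
        ]
        norm = math.sqrt(sum(v * v for v in new_axis))
        if norm == 0.0:
            break
        axis = [v / norm for v in new_axis]
    return axis


def project(centered, axis):
    """Score of each sample along the given axis."""
    return [sum(x * a for x, a in zip(row, axis)) for row in centered]


# Hypothetical log peak abundances; rows are mice, columns are lipid species.
diets = ["SC", "SC", "SC", "KD", "KD", "KD"]
profiles = [
    [10.0, 8.0, 6.0], [10.2, 8.1, 6.1], [9.9, 7.9, 5.9],
    [12.0, 8.0, 7.5], [12.3, 8.2, 7.4], [11.8, 7.9, 7.6],
]
centered = center_columns(profiles)
pc1_scores = project(centered, first_component(centered))
for diet, score in zip(diets, pc1_scores):
    print(diet, round(score, 2))
```

With data like these, the two diet groups fall on opposite sides of the first component. Real lipidomics tables have many more features and usually need scaling before PCA.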
  12. Metabolites. 2023 Aug 31. pii: 986. [Epub ahead of print]13(9):
      Gas chromatography-mass spectrometry (GC-MS) is suitable for the analysis of non-polar analytes. Free amino acids (AA) are polar, zwitterionic, non-volatile and thermally labile analytes. Chemical derivatization of AA is indispensable for their measurement by GC-MS. Specific conversion of AA to their unlabeled methyl esters (d0Me) using 2 M HCl in methanol (CH3OH) is a suitable derivatization procedure (60 min, 80 °C). Performance of this reaction in 2 M HCl in tetradeutero-methanol (CD3OD) generates deuterated methyl esters (d3Me) of AA, which can be used as internal standards in GC-MS. d0Me-AA and d3Me-AA require subsequent conversion to their pentafluoropropionyl (PFP) derivatives for GC-MS analysis using pentafluoropropionic anhydride (PFPA) in ethyl acetate (30 min, 65 °C). d0Me-AA-PFP and d3Me-AA-PFP derivatives of AA are readily extractable into water-immiscible, GC-compatible organic solvents such as toluene. d0Me-AA-PFP and d3Me-AA-PFP derivatives are stable in toluene extracts for several weeks, thus enabling high throughput quantitative measurement of biological AA by GC-MS using in situ prepared d3Me-AA as internal standards in OMICS format. Here, we describe the development of a novel OMICS-compatible QC system and demonstrate its utility for the quality control of quantitative analysis of 21 free AA and metabolites in human plasma samples by GC-MS as Me-PFP derivatives. The QC system involves cross-standardization of the concentrations of the AA in their aqueous solutions at four concentration levels and a quantitative control of AA at the same four concentration levels in pooled human plasma samples. The retention time (tR)-based isotope effects (IE) and the difference (δ(H/D)) of the retention times of the d0Me-AA-PFP derivatives (tR(H)) and the d3Me-AA-PFP derivatives (tR(D)) were determined in human plasma samples from a nutritional study (n = 353) and in co-processed QC human plasma samples (n = 64). In total, more than 400 plasma samples were measured in eight runs in seven working days performed by a single person. The proposed QC system provides information about the quantitative performance of the GC-MS analysis of AA in human plasma. IE, δ(H/D) and a massive drop of the peak area values of the d3Me-AA-PFP derivatives may be suitable as additional parameters of qualitative analysis in targeted GC-MS amino acid-OMICS.
    Keywords:  OMICS; amino acids; metabolites; plasma; quality control; sample preparation
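The two retention-time parameters named in entry 12 can be written down directly. The abstract defines δ(H/D) as the difference of the retention times tR(H) and tR(D); taking IE as the ratio tR(H)/tR(D) is an assumption made here, since the abstract does not state the formula. The retention times below are hypothetical.

```python
def isotope_effect(t_r_h, t_r_d):
    """Retention-time isotope effect, assumed here to be the ratio tR(H) / tR(D)."""
    return t_r_h / t_r_d


def retention_shift(t_r_h, t_r_d):
    """delta(H/D): retention time of the d0 derivative minus that of the d3 derivative."""
    return t_r_h - t_r_d


# Hypothetical retention times in minutes for one amino acid derivative pair.
t_unlabeled, t_deuterated = 6.250, 6.235
print(round(isotope_effect(t_unlabeled, t_deuterated), 4))
print(round(retention_shift(t_unlabeled, t_deuterated), 3))
```

Tracking both values per analyte across runs is one way to use them as the additional QC parameters the abstract suggests.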
  13. Biomedicines. 2023 Aug 23. pii: 2356. [Epub ahead of print]11(9):
      Molnupiravir is an antiviral drug against viral RNA polymerase activity approved by the FDA for the treatment of COVID-19, which is metabolized to β-D-N4-hydroxycytidine (NHC) in human blood plasma. A novel method was developed and validated for quantifying NHC in human plasma within the analytical range of 10-10,000 ng/mL using high-performance liquid chromatography with tandem mass spectrometry (HPLC-MS/MS) to support pharmacokinetics studies. For sample preparation, the method of protein precipitation by acetonitrile was used, with promethazine as an internal standard. Chromatographic separation was carried out on a Shim-pack GWS C18 (150 mm × 4.6 mm, 5 μm) column in a gradient elution mode. A 0.1% formic acid solution in water with 0.08% ammonia solution (eluent A, v/v) and 0.1% formic acid solution in methanol with 0.08% ammonia solution mixed with acetonitrile in a 4:1 ratio (eluent B, v/v) were used as a mobile phase. Electrospray ionization (ESI) was used as an ionization source. The developed method was validated in accordance with the Eurasian Economic Union (EAEU) rules, based on the European Medicines Agency (EMA) and Food and Drug Administration (FDA) guidelines for the following parameters and used within the analytical part of the clinical study of molnupiravir drugs: selectivity, suitability of standard sample, matrix effect, calibration curve, accuracy, precision, recovery, lower limit of quantification (LLOQ), carryover, and stability.
    Keywords:  COVID-19; HPLC-MS/MS; NHC; molnupiravir; pharmacokinetics; plasma; validation; β-D-N4-hydroxycytidine
  14. Anal Bioanal Chem. 2023 Sep 23.
      Over the last decade, applications of ion mobility-mass spectrometry (IM-MS) have exploded due primarily to the widespread commercialization of robust instrumentation from several vendors. Unfortunately, the modest resolving power of many of these platforms (~40-60) has precluded routine separation of constitutional and stereochemical isomers. While instrumentation advances have pushed resolving power to >150 in some cases, chemical approaches offer an alternative for increasing resolution with existing IM-MS instrumentation. Herein we explore the utility of two reactions, derivatization by Girard's reagents and 1,1-carbonyldiimidazole (CDI), for improving IM separation of steroid hormone isomers. These reactions are fast (≤30 min), simple (requiring only basic lab equipment/expertise), and low-cost. Notably, these reactions are structurally selective in that they target carbonyl and hydroxyl groups, respectively, which are found in all naturally occurring steroids. Many steroid hormone isomers differ only in the number, location, and/or stereochemistry of these functional groups, allowing these reactions to "amplify" subtle structural differences and improve IM resolution. Our results show that resolution was significantly improved amongst CDI-derivatized isomer groups of hydroxyprogesterone (two-peak resolution of Rpp = 1.10 between 21-OHP and 11B-OHP), deoxycortisone (Rpp = 1.47 between 11-DHC and 21-DOC), and desoximetasone (Rpp = 1.98 between desoximetasone and fluocortolone). Moreover, characteristic collision cross section (DTCCSN2) measurements can be used to increase confidence in the identification of these compounds in complex biological mixtures. To demonstrate the feasibility of analyzing the derivatized steroids in complex biological matrixes, the reactions were performed following steroid extraction from urine and yielded similar results. Additionally, we applied a software-based approach (high-resolution demultiplexing) that further improved the resolving power (>150). Overall, our results suggest that targeted derivatization reactions coupled with IM-MS can significantly improve the resolution of challenging isomer groups, allowing for more accurate and efficient analysis of complex mixtures.
    Keywords:  Demultiplexing; Derivatization reactions; Ion mobility-mass spectrometry (IM-MS); Steroid hormones
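For readers unfamiliar with the two figures of merit in entry 14, a minimal sketch follows. It assumes the definitions commonly used for Gaussian peaks: single-peak resolving power as peak position divided by its full width at half maximum (FWHM), and two-peak resolution as Rpp = 1.18 × (t2 − t1) / (FWHM1 + FWHM2). The abstract does not spell out the authors' formulas, and the drift times and widths below are hypothetical.

```python
def resolving_power(position, fwhm):
    """Single-peak resolving power: peak position divided by its FWHM."""
    return position / fwhm


def two_peak_resolution(pos_1, pos_2, fwhm_1, fwhm_2):
    """Two-peak resolution Rpp for approximately Gaussian peaks."""
    return 1.18 * abs(pos_2 - pos_1) / (fwhm_1 + fwhm_2)


# Hypothetical drift times (ms) and peak widths (ms) for an isomer pair.
print(round(resolving_power(25.0, 0.5), 1))
print(round(two_peak_resolution(25.0, 25.9, 0.5, 0.5), 2))
```

Under these definitions, Rpp grows either by moving the peaks apart (what derivatization aims for) or by narrowing them (what higher resolving power provides).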
  15. Biomed Chromatogr. 2023 Sep 26. e5754
      Despite aggressive treatment approaches, the overall survival of glioblastoma (GBM) patients remains poor, with a strong need for more effective chemotherapeutic agents. A previous study has shown that ARN14988 is more cytotoxic to GBM cells than US Food and Drug Administration-approved temozolomide. This finding makes ARN14988 a desirable candidate for further pharmacological assessment. Therefore, an efficient analytical method is needed to quantify ARN14988. Herein, we have developed and validated a sample preparation and LC-MS/MS triple quadrupole (QQQ) method for quantification of ARN14988 in mouse plasma. In this method, the liquid-liquid extraction of ARN14988 from mouse plasma was performed using 5% ethyl acetate in hexane. The chromatographic separation was achieved using a C18 column with mobile phases of 10 mM ammonium acetate (pH 5) and 0.1% formic acid in methanol, within a runtime of 10 min. The monitored transitions were m/z 391.20 → m/z 147.00 for ARN14988, and m/z 455.30 → m/z 165.00 for verapamil (internal standard) in positive electrospray ionization. The developed method for ARN14988 showed linearity over the range of 10-5,000 ng/mL (r² > 0.99). The selectivity, sensitivity, matrix effect, recovery, stability, inter-day and intraday accuracy and precision were determined using four quality control samples. This validated method was successfully applied to the pharmacokinetic study of ARN14988 in mice.
    Keywords:  2-methylpropyl ester; 5-chloro-3-[(hexylamino)carbonyl]-3,6-dihydro-2,6-dioxo-1(2H)-pyrimidinecarboxylic acid; ARN14988; acid ceramidase inhibitor; glioblastoma; liquid chromatography-tandem mass spectrometry; pharmacokinetics
  16. Biomolecules. 2023 Sep 04. pii: 1343. [Epub ahead of print]13(9):
      Generative modeling and representation learning of tandem mass spectrometry data aim to learn an interpretable and instrument-agnostic digital representation of metabolites directly from MS/MS spectra. Interpretable and instrument-agnostic digital representations would facilitate comparisons of MS/MS spectra between instrument vendors and enable better and more accurate queries of large MS/MS spectra databases for metabolite identification. In this study, we apply generative modeling and representation learning using variational autoencoders to understand the extent to which tandem mass spectra can be disentangled into their factors of generation (e.g., collision energy, ionization mode, instrument type, etc.) with minimal prior knowledge of the factors. We find that variational autoencoders can disentangle tandem mass spectra data with the proper choice of hyperparameters into meaningful latent representations aligned with known factors of variation. We develop a two-step approach to facilitate the selection of models that are disentangled, which could be applied to other complex and high-dimensional data sets.
    Keywords:  deep learning; disentangled representation; generative models; latent space; tandem mass spectrometry; variational autoencoder