Diagnostics (Basel). 2025 May 01. pii: 1156. [Epub ahead of print]15(9):
Background and Objectives: Accurately discriminating between patients with and without cancer from their cell-free DNA (cfDNA) is crucial for early cancer diagnosis. The end-motifs of cfDNA are significant cancer biomarkers with compelling prospects for cancer diagnosis. This study proposes EM-DeepSD, a signal decomposition deep learning framework based on cfDNA end-motifs, which aims to improve the accuracy of cancer diagnosis and to adapt to different sequencing modalities. Materials and Methods: This study included 146 patients diagnosed with cancer and 122 non-cancer controls. EM-DeepSD comprises three core modules. First, a signal decomposition module decomposes and reconstructs the input end-motif profiles, generating multiple regular subsequences that facilitate subsequent modeling. Then, a machine learning module and a deep learning module are applied to improve the accuracy of cancer diagnosis. Based on the EM-DeepSD framework, we developed the EM-DeepSSA model and compared it with two benchmark methods across different cfDNA sequencing datasets. Results: In the internal validation set, EM-DeepSSA outperformed the two benchmark methods for cancer diagnosis (area under the curve (AUC), 0.920; adjusted p value < 0.05). EM-DeepSSA also performed best on two independent external test sets generated by 5-hydroxymethylcytosine sequencing (5hmCS) and broad-range cell-free DNA sequencing (BR-cfDNA-Seq), respectively (test set-1: AUC = 0.933; test set-2: AUC = 0.956; adjusted p value < 0.05). Conclusions: In summary, we present a new framework that achieves high classification performance in cancer diagnosis and is applicable to different sequencing modalities.
Keywords: cancer diagnosis; cell-free DNA; deep neural network; signal decomposition
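Illustrative sketch (not the authors' code): the abstract describes a signal decomposition module that decomposes and reconstructs an end-motif profile into multiple subsequences, but it does not specify the algorithm. The name EM-DeepSSA suggests singular spectrum analysis (SSA), so the minimal example below assumes basic SSA. The function name `ssa_decompose`, the window length of 16, and the 256-value synthetic profile (a stand-in for 4-mer end-motif frequencies) are all hypothetical choices made for illustration.

```python
import numpy as np


def ssa_decompose(profile, window, n_components):
    """Split a 1-D profile into additive subsequences with basic SSA.

    Assumption: this is a generic SSA sketch, not the published EM-DeepSD module.
    """
    x = np.asarray(profile, dtype=float)
    k = x.size - window + 1
    # Embedding: column j of the trajectory (Hankel) matrix is x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    n_components = min(n_components, s.size)
    subsequences = []
    for i in range(n_components):
        # Rank-one elementary matrix for the i-th singular triple.
        elem = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging: the mean of each anti-diagonal gives one series value.
        flipped = elem[::-1]
        series = [flipped.diagonal(d).mean() for d in range(-window + 1, k)]
        subsequences.append(series)
    return np.array(subsequences)


# Synthetic stand-in for an end-motif frequency profile (hypothetical data).
rng = np.random.default_rng(0)
profile = rng.dirichlet(np.ones(256))

# With all 16 components kept, the subsequences sum back to the input profile.
parts = ssa_decompose(profile, window=16, n_components=16)
leading = ssa_decompose(profile, window=16, n_components=3)
```

Each row of `parts` is one reconstructed subsequence of the same length as the input. In a pipeline of the kind the abstract outlines, such subsequences would be passed to the downstream machine learning and deep learning modules; how EM-DeepSD groups or selects components is not stated in the abstract.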