bims-arihec Biomed News
on Artificial Intelligence in Healthcare
Issue of 2019‒12‒15
fifteen papers selected by
Céline Bélanger
Cogniges Inc.


  1. Intensive Care Med Exp. 2019 Dec 10. 7(1): 70
      BACKGROUND: Prognosticating the course of diseases to inform decision-making is a key component of intensive care medicine. For several applications in medicine, new methods from the field of artificial intelligence (AI) and machine learning have already outperformed conventional prediction models. Due to their technical characteristics, these methods will present new ethical challenges to the intensivist.
    RESULTS: In addition to the standards of data stewardship in medicine, the selection of datasets and algorithms to create AI prognostication models must involve extensive scrutiny to avoid biases and, consequently, injustice against individuals or groups of patients. Assessment of these models for compliance with the ethical principles of beneficence and non-maleficence should also include quantification of predictive uncertainty. Respect for patients' autonomy during decision-making requires transparency of the data processing by AI models to explain the predictions derived from these models. Moreover, a system of continuous oversight can help to maintain public trust in this technology. Based on these considerations as well as recent guidelines, we propose a pathway to an ethical implementation of AI-based prognostication. It includes a checklist for new AI models that deals with medical and technical topics as well as patient- and system-centered issues.
    CONCLUSION: AI models for prognostication will become valuable tools in intensive care. However, they require technical refinement and a careful implementation according to the standards of medical ethics.
    Keywords:  Artificial intelligence; Intensive care; Machine learning; Medical ethics; Prognostication
    DOI:  https://doi.org/10.1186/s40635-019-0286-6
  2. Eur Radiol. 2019 Dec 10.
      OBJECTIVES: To evaluate the diagnostic performance of a deep learning algorithm for automated detection of small 18F-FDG-avid pulmonary nodules in PET scans, and to assess whether novel block sequential regularized expectation maximization (BSREM) reconstruction affects detection accuracy as compared to ordered subset expectation maximization (OSEM) reconstruction.
    METHODS: Fifty-seven patients with 92 18F-FDG-avid pulmonary nodules (all ≤ 2 cm) undergoing PET/CT for oncological (re-)staging were retrospectively included, and a total of 8824 PET images of the lungs were extracted using OSEM and BSREM reconstruction. Per-slice and per-nodule sensitivity of a deep learning algorithm was assessed, with an expert readout by a radiologist/nuclear medicine physician serving as the standard of reference. Receiver operating characteristic (ROC) curves for OSEM and BSREM were assessed and the areas under the ROC curve (AUC) were compared. A maximum standardized uptake value (SUVmax)-based sensitivity analysis and a size-based sensitivity analysis with subgroups defined by nodule size were performed.
    RESULTS: The AUC of the deep learning algorithm for nodule detection was 0.796 (95% CI 0.772-0.869) using OSEM reconstruction and 0.848 (95% CI 0.828-0.869) using BSREM reconstruction. The AUC was significantly higher for BSREM than for OSEM (p = 0.001). In the per-slice analysis, sensitivity and specificity were 66.7% and 79.0% for OSEM, and 69.2% and 84.5% for BSREM. In the per-nodule analysis, the overall sensitivity of OSEM was 81.5% compared to 87.0% for BSREM.
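    As an illustration of the slice-level evaluation reported above, the following Python sketch computes ROC AUC, sensitivity and specificity for two sets of detector scores standing in for OSEM and BSREM; the data and threshold are placeholders, not the study's data or code.
      import numpy as np
      from sklearn.metrics import confusion_matrix, roc_auc_score

      def slice_metrics(y_true, y_score, threshold=0.5):
          """Return (AUC, sensitivity, specificity) for one reconstruction."""
          auc = roc_auc_score(y_true, y_score)
          y_pred = (np.asarray(y_score) >= threshold).astype(int)
          tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
          return auc, tp / (tp + fn), tn / (tn + fp)

      # Placeholder per-slice labels and scores; real inputs would come from the
      # detector run on OSEM- and BSREM-reconstructed PET slices.
      rng = np.random.default_rng(0)
      y_true = rng.integers(0, 2, size=1000)
      scores_osem = np.clip(0.4 * y_true + rng.normal(0.4, 0.25, 1000), 0, 1)
      scores_bsrem = np.clip(0.5 * y_true + rng.normal(0.35, 0.25, 1000), 0, 1)

      for name, scores in [("OSEM", scores_osem), ("BSREM", scores_bsrem)]:
          auc, sens, spec = slice_metrics(y_true, scores)
          print(f"{name}: AUC={auc:.3f}, sensitivity={sens:.3f}, specificity={spec:.3f}")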
    CONCLUSIONS: Our results suggest that machine learning algorithms may aid detection of small 18F-FDG-avid pulmonary nodules in clinical PET/CT. The algorithm performed significantly better on images reconstructed with BSREM than with OSEM.
    KEY POINTS: • The diagnostic value of deep learning for detecting small lung nodules (≤ 2 cm) in PET images using BSREM and OSEM reconstruction was assessed. • BSREM yields higher SUVmax of small pulmonary nodules as compared to OSEM reconstruction. • The use of BSREM translates into a higher detectability of small pulmonary nodules in PET images as assessed with artificial intelligence.
    Keywords:  Artificial intelligence; Deep learning; Diagnostic imaging; Neoplasm Metastasis; Positron-emission tomography
    DOI:  https://doi.org/10.1007/s00330-019-06498-w
  3. J Med Syst. 2019 Dec 10. 44(1): 20
      We conducted a systematic review of the literature to better understand the role of new technologies in the perioperative period, focusing in particular on the administrative and managerial Operating Room (OR) perspective. Studies conducted on adult (≥ 18 years) patients between 2015 and February 2019 were deemed eligible. A total of 19 papers were included. Our review suggests that the use of Machine Learning (ML) in the field of OR organization has great potential. Predictions of surgical case duration were obtained with good performance; their use could therefore allow more precise scheduling, limiting waste of resources. ML is able to support even more complex models, which can coordinate multiple spaces simultaneously, as in the case of the post-anesthesia care unit and operating rooms. AI techniques could also be used to limit another organizational problem with important economic repercussions: cancellations. Random Forest has proven effective in identifying surgeries at high risk of cancellation, allowing preventive measures to be planned to reduce the cancellation rate accordingly. In conclusion, although data in the literature are still limited, we believe that ML has great potential in the field of OR organization; however, further studies are needed to assess the effective role of these new technologies in perioperative medicine.
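    As a rough illustration of the Random Forest cancellation models mentioned above, the sketch below trains a classifier on a synthetic scheduling dataset; the feature names and target are illustrative assumptions, not drawn from any included study.
      import numpy as np
      import pandas as pd
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(1)
      n = 2000
      X = pd.DataFrame({
          "age": rng.integers(18, 90, n),
          "asa_class": rng.integers(1, 5, n),
          "lead_time_days": rng.integers(1, 120, n),
          "prior_cancellations": rng.integers(0, 4, n),
      })
      # Synthetic cancellation labels loosely tied to lead time and cancellation history.
      p = 1 / (1 + np.exp(-(0.01 * X["lead_time_days"] + 0.8 * X["prior_cancellations"] - 2.5)))
      y = rng.binomial(1, p)

      model = RandomForestClassifier(n_estimators=200, random_state=0)
      auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
      print(f"Cross-validated AUC for cancellation risk: {auc:.2f}")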
    Keywords:  Anesthesia; Artificial intelligence; Big data; Block time; Hospital administration; Machine learning; Operating room; Operating room efficiency; Perioperative; Recovery room; Robotic assisted surgery; Scheduling
    DOI:  https://doi.org/10.1007/s10916-019-1512-1
  4. Minerva Urol Nefrol. 2019 Dec 12.
      INTRODUCTION: As we enter the era of "big data", an increasing amount of complex healthcare data will become available. These data are often redundant, "noisy", and characterized by wide variability. To offer a precise and transversal view of a clinical scenario, Artificial Intelligence (AI) approaches based on Machine Learning (ML) algorithms and Artificial Neural Networks (ANNs) have been adopted, with promising wide diffusion expected in the near future. The present work aims to provide a comprehensive and critical overview of the current and potential applications of AI and ANNs in Urology.
    EVIDENCE ACQUISITION: A non-systematic review of the literature was performed by screening Medline, PubMed, the Cochrane Database, and Embase to detect pertinent studies regarding the application of AI and ANNs in Urology.
    EVIDENCE SYNTHESIS: The main application of AI in urology is in the field of genitourinary cancers. Focusing on prostate cancer, AI has been applied to the prediction of prostate biopsy results. For bladder cancer, the prediction of recurrence-free probability and diagnostic evaluation were analysed with ML algorithms. For kidney and testis cancer, anecdotal experiences were reported for staging and prediction of disease recurrence. More recently, AI has been applied to non-oncological conditions such as urinary stones and functional urology.
    CONCLUSIONS: AI technologies are playing a growing role in health care, but their "real-life" implementation remains limited to date. In the near future, however, the AI-driven era could change clinical practice in Urology and improve overall patient outcomes.
    DOI:  https://doi.org/10.23736/S0393-2249.19.03613-0
  5. BMJ Open. 2019 Dec 11. 9(12): e030482
      INTRODUCTION: Infants can experience pain similar to adults, and improperly controlled pain stimuli could have a long-term adverse impact on their cognitive and neurological function development. The biggest challenge of achieving good infant pain control is obtaining objective pain assessment when direct communication is lacking. For years, computer scientists have developed many different facial expression-centred machine learning (ML) methods for automatic infant pain assessment. Many of these ML algorithms showed rather satisfactory performance and have demonstrated good potential to be further enhanced for implementation in real-world clinical settings. To date, there is no prior research that has systematically summarised and compared the performance of these ML algorithms. Our proposed meta-analysis will provide the first comprehensive evidence on this topic to guide further ML algorithm development and clinical implementation.
    METHODS AND ANALYSIS: We will search four major public electronic medical and computer science databases including Web of Science, PubMed, Embase and IEEE Xplore Digital Library from January 2008 to present. All the articles will be imported into the Covidence platform for study eligibility screening and inclusion. Study-level extracted data will be stored in the Systematic Review Data Repository online platform. The primary outcome will be the prediction accuracy of the ML model. The secondary outcomes will be model utility measures including generalisability, interpretability and computational efficiency. All extracted outcome data will be imported into RevMan V.5.2.1 software and R V3.3.2 for analysis. Risk of bias will be summarised using the latest Prediction Model Study Risk of Bias Assessment Tool.
    ETHICS AND DISSEMINATION: This systematic review and meta-analysis will only use study-level data from public databases, thus formal ethical approval is not required. The results will be disseminated in the form of an official publication in a peer-reviewed journal and/or presentation at relevant conferences.
    PROSPERO REGISTRATION NUMBER: CRD42019118784.
    Keywords:  artificial intelligence; facial expression; infant; machine learning; pain
    DOI:  https://doi.org/10.1136/bmjopen-2019-030482
  6. Artif Intell Med. 2019 Nov. pii: S0933-3657(19)30072-7. [Epub ahead of print] 101: 101708
      Metabolic Syndrome (MetS) is associated with the risk of developing chronic diseases (atherosclerotic cardiovascular disease, type 2 diabetes, cancers and chronic kidney disease), so its early detection plays an important role in prevention. Previous research showed that an artificial neural network (ANN) is a suitable tool for algorithmic MetS diagnostics that includes solely non-invasive, low-cost and easily obtainable (NI&LC&EO) diagnostic methods. This paper considers four well-known machine learning methods (linear regression, artificial neural network, decision tree and random forest) for MetS prediction and compares them, in order to facilitate the development of appropriate medical software based on these methods. Training, validation and testing are conducted on a large dataset that includes 3000 persons. Input vectors are very simple and contain the following parameters: gender, age, body mass index, waist-to-height ratio, and systolic and diastolic blood pressures, while the output is the MetS diagnosis in true/false form, made in accordance with the International Diabetes Federation (IDF) criteria. The comparison leads to the conclusion that random forest achieves the highest specificity (SPC=0.9436), sensitivity (SNS=0.9154), and positive (PPV=0.9379) and negative (NPV=0.9150) predictive values. Algorithmic diagnosis of MetS could be beneficial in everyday clinical practice since it can easily identify high-risk patients.
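    As an illustration of the reported comparison metrics, the following Python sketch trains a random forest on the six listed inputs and computes sensitivity (SNS), specificity (SPC), PPV and NPV; the data and the simplified labelling rule are synthetic placeholders, not the study dataset or the IDF criteria.
      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.metrics import confusion_matrix
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(2)
      n = 3000
      X = np.column_stack([
          rng.integers(0, 2, n),          # gender (0/1)
          rng.integers(20, 80, n),        # age (years)
          rng.normal(27, 5, n),           # body mass index
          rng.normal(0.55, 0.08, n),      # waist-to-height ratio
          rng.normal(130, 18, n),         # systolic blood pressure
          rng.normal(82, 11, n),          # diastolic blood pressure
      ])
      y = ((X[:, 2] > 30) & (X[:, 4] > 135)).astype(int)   # toy labelling rule, NOT the IDF definition

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
      clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
      tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
      print(f"SNS={tp / (tp + fn):.3f}  SPC={tn / (tn + fp):.3f}  "
            f"PPV={tp / (tp + fp):.3f}  NPV={tn / (tn + fn):.3f}")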
    Keywords:  Artificial neural network; Decision tree; Linear regression; Metabolic syndrome; Random forest
    DOI:  https://doi.org/10.1016/j.artmed.2019.101708
  7. JAMA Surg. 2019 Dec 11.
      Importance: Surgeons make complex, high-stakes decisions under time constraints and uncertainty, with significant effect on patient outcomes. This review describes the weaknesses of traditional clinical decision-support systems and proposes that artificial intelligence should be used to augment surgical decision-making.
    Observations: Surgical decision-making is dominated by hypothetical-deductive reasoning, individual judgment, and heuristics. These factors can lead to bias, error, and preventable harm. Traditional predictive analytics and clinical decision-support systems are intended to augment surgical decision-making, but their clinical utility is compromised by time-consuming manual data management and suboptimal accuracy. These challenges can be overcome by automated artificial intelligence models fed by livestreaming electronic health record data with mobile device outputs. This approach would require data standardization, advances in model interpretability, careful implementation and monitoring, attention to ethical challenges involving algorithm bias and accountability for errors, and preservation of bedside assessment and human intuition in the decision-making process.
    Conclusions and Relevance: Integration of artificial intelligence with surgical decision-making has the potential to transform care by augmenting the decision to operate, informed consent process, identification and mitigation of modifiable risk factors, decisions regarding postoperative management, and shared decisions regarding resource use.
    DOI:  https://doi.org/10.1001/jamasurg.2019.4917
  8. J Neurosurg. 2019 Dec 06. pii: 2019.9.JNS191949. [Epub ahead of print] 1-9
      OBJECTIVE: Automatic segmentation of vestibular schwannomas (VSs) from MRI could significantly improve clinical workflow and assist in patient management. Accurate tumor segmentation and volumetric measurements provide the best indicators to detect subtle VS growth, but current techniques are labor intensive and dedicated software is not readily available within the clinical setting. The authors aim to develop a novel artificial intelligence (AI) framework to be embedded in the clinical routine for automatic delineation and volumetry of VS.
    METHODS: Imaging data (contrast-enhanced T1-weighted [ceT1] and high-resolution T2-weighted [hrT2] MR images) from all patients meeting the study's inclusion/exclusion criteria who had a single sporadic VS treated with Gamma Knife stereotactic radiosurgery were used to create a model. The authors developed a novel AI framework based on a 2.5D convolutional neural network (CNN) to exploit the different in-plane and through-plane resolutions encountered in standard clinical imaging protocols. They used a computational attention module to enable the CNN to focus on the small VS target and proposed supervision of the attention map for more accurate segmentation. The manually segmented target tumor volume (also tested for interobserver variability) was used as the ground truth for training and evaluation of the CNN. The Dice score, average symmetric surface distance (ASSD), and relative volume error (RVE) of the automatic segmentation results were quantitatively measured against the manual segmentations to assess the model's accuracy.
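    As an illustration of two of the evaluation metrics named above, the sketch below computes the Dice score and relative volume error between a predicted and a reference binary mask; the toy masks stand in for automatic and manual VS segmentations, and this is not the authors' framework.
      import numpy as np

      def dice_score(pred, truth):
          """Dice overlap between two binary masks (voxel arrays)."""
          pred, truth = pred.astype(bool), truth.astype(bool)
          intersection = np.logical_and(pred, truth).sum()
          return 2.0 * intersection / (pred.sum() + truth.sum())

      def relative_volume_error(pred, truth):
          """Absolute volume difference as a fraction of the reference volume."""
          return abs(int(pred.sum()) - int(truth.sum())) / truth.sum()

      # Toy 3D masks standing in for automatic and manual VS segmentations.
      truth = np.zeros((64, 64, 32), dtype=bool)
      truth[20:40, 20:40, 10:20] = True
      pred = np.zeros_like(truth)
      pred[22:40, 21:41, 10:21] = True

      print(f"Dice = {100 * dice_score(pred, truth):.2f}%, RVE = {100 * relative_volume_error(pred, truth):.2f}%")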
    RESULTS: Imaging data from all eligible patients (n = 243) were randomly split into 3 nonoverlapping groups for training (n = 177), hyperparameter tuning (n = 20), and testing (n = 46). Dice, ASSD, and RVE scores were measured on the testing set for the respective input data types as follows: ceT1 93.43%, 0.203 mm, 6.96%; hrT2 88.25%, 0.416 mm, 9.77%; combined ceT1/hrT2 93.68%, 0.199 mm, 7.03%. Given a margin of 5% for the Dice score, the automated method was shown to achieve statistically equivalent performance in comparison to an annotator using ceT1 images alone (p = 4e-13) and combined ceT1/hrT2 images (p = 7e-18) as inputs.
    CONCLUSIONS: The authors developed a robust AI framework for automatically delineating and calculating VS tumor volume and have achieved excellent results, equivalent to those achieved by an independent human annotator. This promising AI technology has the potential to improve the management of patients with VS and potentially other brain tumors.
    Keywords:  AI = artificial intelligence; ASSD = average symmetric surface distance; CNN = convolutional neural network; DL = deep learning; GK = Gamma Knife; HDL = hardness-weighted Dice loss; MRI; RVE = relative volume error; SRS = stereotactic radiosurgery; SpvA = supervised attention module; VS = vestibular schwannoma; artificial intelligence; ceT1 = contrast-enhanced T1-weighted; convolutional neural network; hrT2 = high-resolution T2-weighted; oncology; segmentation; tumor; vestibular schwannoma
    DOI:  https://doi.org/10.3171/2019.9.JNS191949
  9. J Clin Hypertens (Greenwich). 2019 Dec 09.
      Hypertension is a significant public health issue. The ability to predict the risk of developing hypertension could contribute to disease prevention strategies. This study used machine learning techniques to develop and validate a new risk prediction model for new-onset hypertension. In Japan, the Industrial Safety and Health Law requires employers to provide annual health checkups to their employees. We used 2005-2016 health checkup data from 18 258 individuals, at the time of hypertension diagnosis [Year (0)] and in the two previous annual visits [Year (-1) and Year (-2)]. Data were entered into models based on machine learning methods (XGBoost and ensemble) or traditional statistical methods (logistic regression). Data were randomly split into a derivation set (75%, n = 13 694) used for model construction and development, and a validation set (25%, n = 4564) used to test performance of the derived models. The best predictor in the XGBoost model was systolic blood pressure during cardio-ankle vascular index measurement at Year (-1). Area under the receiver operating characteristic curve values in the validation cohort were 0.877, 0.881, and 0.859 for the XGBoost, ensemble, and logistic regression models, respectively. We have developed a highly precise prediction model for future hypertension using machine learning methods in a general normotensive population. This could be used to identify at-risk individuals and facilitate earlier non-pharmacological intervention to prevent the future development of hypertension.
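    The sketch below illustrates the comparison described above, an XGBoost classifier versus logistic regression evaluated by ROC AUC on a 75/25 derivation/validation split; the synthetic features are placeholders for the health-checkup variables, and the example assumes the xgboost package is available.
      from sklearn.datasets import make_classification
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import roc_auc_score
      from sklearn.model_selection import train_test_split
      from xgboost import XGBClassifier   # assumes the xgboost package is installed

      # Synthetic stand-in for the checkup variables and new-onset hypertension labels.
      X, y = make_classification(n_samples=18258, n_features=20, n_informative=8, random_state=0)
      X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.25, random_state=0)

      models = {
          "XGBoost": XGBClassifier(n_estimators=300, max_depth=4),
          "Logistic regression": LogisticRegression(max_iter=1000),
      }
      for name, model in models.items():
          model.fit(X_tr, y_tr)
          auc = roc_auc_score(y_va, model.predict_proba(X_va)[:, 1])
          print(f"{name}: validation AUC = {auc:.3f}")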
    Keywords:  artificial intelligence; hypertension; machine learning; prediction model
    DOI:  https://doi.org/10.1111/jch.13759
  10. Acad Radiol. 2020 Jan. pii: S1076-6332(19)30442-8. [Epub ahead of print] 27(1): 117-120
      The AUR Academic Radiology and Industry Leaders Roundtable was organized as an open discussion between academic leaders of top US academic radiology departments and industry leaders from top companies that provide equipment and services to radiology, including manufacturers, pharmaceutical companies, software developers and electronic medical record (EMR) providers. The format was that of a structured brainstorming session with pre-selected discussion topics. This roundtable was instrumental in widening perspectives and providing insights into the challenges and opportunities for our specialty, such as in the case of Artificial Intelligence (AI).
    Keywords:  academic radiology; academic-industry partnerships; artificial intelligence
    DOI:  https://doi.org/10.1016/j.acra.2019.07.031
  11. J Dent. 2019 Dec 07. pii: S0300-5712(19)30270-2. [Epub ahead of print] 103260
      OBJECTIVES: In this pilot study, we applied deep convolutional neural networks (CNNs) to detect caries lesions in Near-Infrared-Light Transillumination (NILT) images.
    METHODS: 226 extracted posterior permanent human teeth (113 premolars, 113 molars) were allocated to groups of 2 + 2 teeth and mounted in a pilot-tested diagnostic model in a dummy head. NILT images of single-tooth segments were generated using DIAGNOcam (KaVo, Biberach). For each segment (on average 435 × 407 × 3 pixels), occlusal and/or proximal caries lesions were annotated by two experienced dentists using an in-house developed digital annotation tool. The pixel-based annotations were translated into binary class labels. We trained two state-of-the-art CNNs (Resnet18, Resnext50) and validated them via 10-fold cross-validation. During the training process, we applied data augmentation (random resizing, rotations and flipping) and a one-cycle learning rate policy, setting the minimum and maximum learning rates to 10^-5 and 10^-3, respectively. Metrics for model performance were the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and positive/negative predictive values (PPV/NPV). Feature visualization was additionally applied to assess whether the CNNs built on features dentists would also use.
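    A minimal PyTorch sketch of the training setup described above (a pretrained ResNet18 with a two-class head, Adam, and a one-cycle learning rate policy between roughly 10^-5 and 10^-3); random tensors stand in for NILT segments, the augmentation transforms are omitted, and none of this is the authors' pipeline.
      import torch
      import torch.nn as nn
      from torch.utils.data import DataLoader, TensorDataset
      from torchvision import models   # the weights argument assumes torchvision >= 0.13

      images = torch.rand(64, 3, 224, 224)            # placeholder NILT segments
      labels = torch.randint(0, 2, (64,))             # 0 = sound, 1 = caries lesion
      loader = DataLoader(TensorDataset(images, labels), batch_size=10, shuffle=True)

      model = models.resnet18(weights="IMAGENET1K_V1")  # ImageNet-pretrained backbone
      model.fc = nn.Linear(model.fc.in_features, 2)     # two-class output head

      optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
      epochs = 3
      scheduler = torch.optim.lr_scheduler.OneCycleLR(
          optimizer, max_lr=1e-3, steps_per_epoch=len(loader), epochs=epochs,
          div_factor=100)                               # starts near 1e-5, peaks at 1e-3
      criterion = nn.CrossEntropyLoss()

      for _ in range(epochs):
          for x, y in loader:
              optimizer.zero_grad()
              loss = criterion(model(x), y)
              loss.backward()
              optimizer.step()
              scheduler.step()                          # advance the one-cycle schedule per batch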
    RESULTS: The tooth-level prevalence of caries lesions was 41%. The two models performed similarly in predicting caries on tooth segments of NILT images. The marginally better model with respect to AUC was Resnext50, for which we retrained the last 9 network layers, using the Adam optimizer, a learning rate of 0.5 × 10^-4, and a batch size of 10. The mean (95% CI) AUC was 0.74 (0.66-0.82). Sensitivity and specificity were 0.59 (0.47-0.70) and 0.76 (0.68-0.84), respectively. The resulting PPV was 0.63 (0.51-0.74), the NPV 0.73 (0.65-0.80). Visual inspection of model predictions found the model to be sensitive to areas affected by caries lesions.
    CONCLUSIONS: A moderately deep CNN trained on a limited amount of NILT image data showed satisfactory discriminatory ability to detect caries lesions.
    CLINICAL SIGNIFICANCE: CNNs may be useful to assist NILT-based caries detection. This could be especially relevant in non-conventional dental settings, like schools, care homes or rural outpost centers.
    Keywords:  Artificial Intelligence; Caries; Diagnostics; Digital imaging/radiology; Mathematical modeling
    DOI:  https://doi.org/10.1016/j.jdent.2019.103260
  12. Abdom Radiol (NY). 2019 Dec 10.
      PURPOSE: To predict the histologic grade of small clear cell renal cell carcinomas (ccRCCs) using texture analysis and machine learning algorithms.
    METHODS: Fifty-two noncontrast (NC), 26 corticomedullary (CM) phase, and 35 nephrographic (NG) phase CTs of small (< 4 cm) surgically resected ccRCCs were retrospectively identified. Surgical pathology classified the tumors as low- or high-Fuhrman histologic grade. The axial image with the largest cross-sectional tumor area was exported and segmented. Six histogram and 31 texture features (gray-level co-occurrence (GLC) and gray-level run-length (GLRL)) were calculated for each tumor in each phase. T tests compared feature values in low- and high-grade ccRCCs, with a (Benjamini-Hochberg) false discovery rate of 10%. The area under the receiver operating characteristic curve (AUC) was calculated for each feature to assess prediction of low- and high-grade ccRCCs in each phase. Histogram, texture, and combined histogram and texture data sets were used to train and test four algorithms (k-nearest neighbor (KNN), support vector machine (SVM), random forest, and decision tree) with tenfold cross-validation; AUCs were calculated for each algorithm in each phase to assess prediction of low- and high-grade ccRCCs.
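    As an illustration of the texture pipeline above, the sketch below extracts a few gray-level co-occurrence (GLCM) features per region of interest with scikit-image and compares the four listed classifiers with tenfold cross-validation; the ROIs and grade labels are synthetic placeholders, not CT segmentations.
      import numpy as np
      from skimage.feature import graycomatrix, graycoprops   # requires scikit-image >= 0.19
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import cross_val_score
      from sklearn.neighbors import KNeighborsClassifier
      from sklearn.svm import SVC
      from sklearn.tree import DecisionTreeClassifier

      def glcm_features(roi):
          """A few gray-level co-occurrence texture features for an 8-bit ROI."""
          glcm = graycomatrix(roi, distances=[1], angles=[0], levels=256, symmetric=True, normed=True)
          return [graycoprops(glcm, prop)[0, 0] for prop in ("contrast", "homogeneity", "energy", "correlation")]

      rng = np.random.default_rng(3)
      rois = [rng.integers(0, 256, (32, 32), dtype=np.uint8) for _ in range(60)]  # placeholder tumor ROIs
      X = np.array([glcm_features(roi) for roi in rois])
      y = np.tile([0, 1], 30)                                  # placeholder low-/high-grade labels

      classifiers = {
          "KNN": KNeighborsClassifier(),
          "SVM": SVC(),
          "Random forest": RandomForestClassifier(random_state=0),
          "Decision tree": DecisionTreeClassifier(random_state=0),
      }
      for name, clf in classifiers.items():
          auc = cross_val_score(clf, X, y, cv=10, scoring="roc_auc").mean()
          print(f"{name}: mean AUC = {auc:.2f}")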
    RESULTS: Zero, 23, and zero features in the NC, CM, and NG phases, respectively, had statistically significant differences between low- and high-grade ccRCCs. CM histogram skewness and GLRL short-run emphasis had the highest per-feature AUCs (0.82) for predicting histologic grade. All four algorithms achieved their highest AUCs (0.97) when predicting histologic grade using CM histogram features. The algorithms' AUCs decreased when using histogram or texture features from the NC or NG phases.
    CONCLUSION: The histologic grade of small ccRCCs can be accurately predicted with machine learning algorithms using CM histogram features, which outperform NC and NG phase image data.
    Keywords:  Clear cell renal cell carcinoma; Histology; Machine learning; Texture
    DOI:  https://doi.org/10.1007/s00261-019-02336-1
  13. JAMA Netw Open. 2019 Dec 02. 2(12): e1917221
      Importance: Inpatient overcrowding is associated with delays in care, including the deferral of surgical care until beds are available to accommodate postoperative patients. Timely patient discharge is critical to address inpatient overcrowding and requires coordination among surgeons, nurses, case managers, and others. This is difficult to achieve without early identification and systemwide transparency of discharge candidates and their respective barriers to discharge.
    Objective: To validate the performance of a clinically interpretable feedforward neural network model that could improve the discharge process by predicting which patients would be discharged within 24 hours and their clinical and nonclinical barriers.
    Design, Setting, and Participants: This prognostic study included adult patients discharged from inpatient surgical care from May 1, 2016, to August 31, 2017, at a quaternary care teaching hospital. Model performance was assessed with standard cross-validation techniques. The model's performance was compared with a baseline model using historical procedure median length of stay to predict discharges. In prospective cohort analysis, the feedforward neural network model was used to make predictions on general surgical care floors with 63 beds. If patients were not discharged when predicted, the causes of delay were recorded.
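    As a rough illustration of the comparison described above, the sketch below fits a small feedforward neural network to predict discharge within 24 hours and compares it, by ROC AUC, with a baseline that simply flags patients whose elapsed stay has reached the historical median length of stay for their procedure; all variables and data are illustrative placeholders, not the study's model or cohort.
      import numpy as np
      import pandas as pd
      from sklearn.metrics import roc_auc_score
      from sklearn.model_selection import train_test_split
      from sklearn.neural_network import MLPClassifier
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler

      rng = np.random.default_rng(4)
      n = 5000
      df = pd.DataFrame({
          "procedure": rng.integers(0, 20, n),        # procedure code
          "elapsed_days": rng.integers(0, 15, n),     # days since surgery
          "pending_consults": rng.integers(0, 3, n),  # example discharge barrier
      })
      median_los = df.groupby("procedure")["elapsed_days"].transform("median")
      p = 1 / (1 + np.exp(-(df["elapsed_days"] - median_los - df["pending_consults"])))
      df["discharged_24h"] = rng.binomial(1, p)        # synthetic outcome

      X = df[["procedure", "elapsed_days", "pending_consults"]]
      y = df["discharged_24h"]
      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

      nn_model = make_pipeline(StandardScaler(),
                               MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0))
      nn_model.fit(X_tr, y_tr)
      baseline = (X_te["elapsed_days"] >= median_los.loc[X_te.index]).astype(float)

      print("Feedforward network AUC:", round(roc_auc_score(y_te, nn_model.predict_proba(X_te)[:, 1]), 3))
      print("Median length-of-stay baseline AUC:", round(roc_auc_score(y_te, baseline), 3))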
    Main Outcomes and Measures: The primary outcome was the out-of-sample area under the receiver operating characteristic curve of the model. Secondary outcomes included the causes of discharge delay and the number of avoidable bed-days.
    Results: The model was trained on 15 201 patients (median [interquartile range] age, 60 [46-70] years; 7623 [50.1%] men) discharged from inpatient surgical care. The estimated out-of-sample area under the receiver operating characteristic curve of the model was 0.840 (SD, 0.008; 95% CI, 0.839-0.844). Compared with the baseline model, the neural network model had higher sensitivity (52.5% vs 56.6%) and specificity (51.7% vs 82.6%). The neural network model identified 65 barriers to discharge. In the prospective study of 605 patients, causes of delays included clinical barriers (41 patients [30.1%]), variation in clinical practice (30 patients [22.1%]), and nonclinical reasons (65 patients [47.8%]). Summing patients who were not discharged owing to variation in clinical practice and nonclinical reasons, 128 bed-days, or 1.2 beds per day, were classified as avoidable.
    Conclusions and Relevance: This cohort study found that a neural network model could predict daily inpatient surgical care discharges and their barriers. The model identified systemic causes of discharge delays. Such models should be studied for their ability to increase the timeliness of discharges.
    DOI:  https://doi.org/10.1001/jamanetworkopen.2019.17221
  14. PLoS One. 2019. 14(12): e0226518
      BACKGROUND: The triage of patients in prehospital care is a difficult task, and improved risk assessment tools are needed both at the dispatch center and on the ambulance to differentiate between low- and high-risk patients. This study validates a machine learning-based approach to generating risk scores based on hospital outcomes using routinely collected prehospital data.
    METHODS: Dispatch, ambulance, and hospital data were collected in one Swedish region from 2016-2017. Dispatch center and ambulance records were used to develop gradient boosting models predicting hospital admission, critical care (defined as admission to an intensive care unit or in-hospital mortality), and two-day mortality. Composite risk scores were generated based on the models and compared to National Early Warning Scores (NEWS) and actual dispatched priorities in a prospectively gathered dataset from 2018.
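    A minimal sketch of the composite-score idea described above: one gradient boosting model per hospital outcome, with the predicted probabilities combined into a single risk score. The features, labels, and the simple averaging rule are illustrative assumptions, not the study's models.
      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.ensemble import GradientBoostingClassifier
      from sklearn.model_selection import train_test_split

      # Placeholder dispatch/ambulance features and outcome labels.
      X, _ = make_classification(n_samples=4000, n_features=15, random_state=0)
      rng = np.random.default_rng(5)
      outcomes = {
          "admission": rng.binomial(1, 0.4, 4000),
          "critical_care": rng.binomial(1, 0.05, 4000),
          "two_day_mortality": rng.binomial(1, 0.02, 4000),
      }

      idx_train, idx_test = train_test_split(np.arange(4000), test_size=0.25, random_state=0)
      probabilities = []
      for name, y in outcomes.items():
          model = GradientBoostingClassifier(random_state=0).fit(X[idx_train], y[idx_train])
          probabilities.append(model.predict_proba(X[idx_test])[:, 1])

      composite_risk = np.mean(probabilities, axis=0)   # one composite score per patient
      print("First five composite risk scores:", np.round(composite_risk[:5], 3))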
    RESULTS: A total of 38 203 patients were included from 2016-2018. Concordance indexes (areas under the receiver operating characteristic curve) for dispatched priorities ranged from 0.51 to 0.66, while those for NEWS ranged from 0.66 to 0.85. Concordance ranged from 0.70 to 0.79 for risk scores based only on dispatch data, and from 0.79 to 0.89 for risk scores including ambulance data. Dispatch data-based risk scores consistently outperformed dispatched priorities in predicting hospital outcomes, while models including ambulance data also consistently outperformed NEWS. Model performance in the prospective test dataset was similar to that found using cross-validation, and calibration was comparable to that of NEWS.
    CONCLUSIONS: Machine learning-based risk scores outperformed a widely-used rule-based triage algorithm and human prioritization decisions in predicting hospital outcomes. Performance was robust in a prospectively gathered dataset, and scores demonstrated adequate calibration. Future research should explore the robustness of these methods when applied to other settings, establish appropriate outcome measures for use in determining the need for prehospital care, and investigate the clinical impact of interventions based on these methods.
    DOI:  https://doi.org/10.1371/journal.pone.0226518
  15. Diagnostics (Basel). 2019 Dec 11. pii: E219. [Epub ahead of print] 9(4):
      (1) Background: One of the most common cancers affecting North American men and men worldwide is prostate cancer. The Gleason score is a pathological grading system used to examine the potential aggressiveness of the disease in prostate tissue. Advancements in computing and next-generation sequencing technology now allow us to study the genomic profiles of patients in association with their different Gleason scores more accurately and effectively. (2) Methods: In this study, we used a novel machine learning method to analyse the gene expression of prostate tumours with different Gleason scores and to identify potential genetic biomarkers for each Gleason group. We obtained a publicly available RNA-Seq dataset of a cohort of 104 prostate cancer patients from the National Center for Biotechnology Information's (NCBI) Gene Expression Omnibus (GEO) repository, and categorised patients based on their Gleason scores to create a hierarchy of disease progression. A hierarchical model with standard classifiers in different Gleason groups, also known as nodes, was developed to identify and predict nodes based on their mRNA or gene expression. In each node, patient samples were analysed via class-imbalance and hybrid feature-selection techniques to build the prediction model. The outcome of the analysis of each node was a set of genes that could differentiate each Gleason group from the remaining groups. To validate the proposed method, the set of identified genes was used to classify a second dataset of 499 prostate cancer patients collected from cBioPortal. (3) Results: The overall accuracy of applying this novel method to the first dataset was 93.3%; the method was further validated to have 87% accuracy using the second dataset. This method also identified genes that were not previously reported as potential biomarkers for specific Gleason groups. In particular, PIAS3 was identified as a potential biomarker for Gleason score 4 + 3 = 7, and UBE2V2 for Gleason score 6. (4) Insight: Previous reports show that the genes predicted by this newly proposed method strongly correlate with prostate cancer development and progression. Furthermore, pathway analysis shows that both PIAS3 and UBE2V2 share similar protein interaction pathways, namely the JAK/STAT signaling process.
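    As an illustration of what a single node of the hierarchy described above could look like, the sketch below combines univariate feature selection with a class-weighted classifier in a one-vs-rest setting on synthetic expression data; the paper's own hybrid feature-selection scheme and hierarchical structure are not reproduced here.
      import numpy as np
      from sklearn.feature_selection import SelectKBest, f_classif
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import cross_val_score
      from sklearn.pipeline import Pipeline

      rng = np.random.default_rng(6)
      X = rng.normal(size=(104, 5000))              # 104 patients x 5000 gene-expression values
      y = np.array([1] * 26 + [0] * 78)             # 1 = this Gleason group, 0 = all other groups

      node = Pipeline([
          ("select", SelectKBest(f_classif, k=50)),                          # keep 50 candidate genes
          ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
      ])
      accuracy = cross_val_score(node, X, y, cv=5, scoring="accuracy").mean()
      print(f"Cross-validated accuracy for this node: {accuracy:.2f}")
      # Genes retained by the selector would be this node's candidate biomarkers.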
    Keywords:  Gleason score detection; classification; next generation sequencing; prostate cancer; supervised learning; transcriptomics
    DOI:  https://doi.org/10.3390/diagnostics9040219