bims-arihec Biomed News
on Artificial intelligence in healthcare
Issue of 2020–02–16
twenty-one papers selected by
Céline Bélanger, Cogniges Inc.



  1. Radiol Bras. 2019 Nov-Dec;52(6): 387-396
      The discipline of radiology and diagnostic imaging has evolved greatly in recent years. We have observed an exponential increase in the number of exams performed, subspecialization of medical fields, and increases in accuracy of the various imaging methods, making it a challenge for the radiologist to "know everything about all exams and regions". In addition, imaging exams are no longer only qualitative and diagnostic; they now provide quantitative information on disease severity and identify biomarkers of prognosis and treatment response. In view of this, computer-aided diagnosis systems have been developed with the objective of complementing diagnostic imaging and helping the therapeutic decision-making process. With the advent of artificial intelligence, "big data", and machine learning, we are moving toward the rapid expansion of the use of these tools in the daily life of physicians, making each patient unique, as well as leading radiology toward the concept of a multidisciplinary approach and precision medicine. In this article, we will present the main aspects of the computational tools currently available for analysis of images and the principles of such analysis, together with the main terms and concepts involved, as well as examining the impact that the development of artificial intelligence has had on radiology and diagnostic imaging.
    Keywords:  Artificial intelligence; Computer aided diagnosis; Machine learning; Radiomics
    DOI:  https://doi.org/10.1590/0100-3984.2019.0049
  2. Crit Care Med. 2020 Feb 11.
       OBJECTIVES: As the performance of a conventional track and trigger system in a rapid response system has been unsatisfactory, we developed and implemented an artificial intelligence for predicting in-hospital cardiac arrest, denoted the deep learning-based early warning system. The purpose of this study was to compare the performance of an artificial intelligence-based early warning system with that of conventional methods in a real hospital situation.
    DESIGN: Retrospective cohort study.
    SETTING: This study was conducted at a hospital in which deep learning-based early warning system was implemented.
    PATIENTS: We reviewed the records of adult patients who were admitted to the general ward of our hospital from April 2018 to March 2019.
    INTERVENTIONS: The study population included 8,039 adult patients. A total of 83 events of deterioration occurred during the study period. The outcome was events of deterioration, defined as cardiac arrest and unexpected ICU admission. We defined a true alarm as an alarm occurring within 0.5-24 hours before a deteriorating event.
    MEASUREMENTS AND MAIN RESULTS: We used the area under the receiver operating characteristic curve, area under the precision-recall curve, number needed to examine, and mean alarm count per day as comparative measures. The deep learning-based early warning system (area under the receiver operating characteristic curve, 0.865; area under the precision-recall curve, 0.066) outperformed the modified early warning score (area under the receiver operating characteristic curve, 0.682; area under the precision-recall curve, 0.010) and reduced the number needed to examine and mean alarm count per day by 69.2% and 59.6%, respectively. At the same specificity, deep learning-based early warning system had up to 257% higher sensitivity than conventional methods.
    CONCLUSIONS: The developed artificial intelligence based on deep learning, the deep learning-based early warning system, accurately predicted deterioration of patients in a general ward and outperformed conventional methods. This study showed the potential and effectiveness of artificial intelligence in a rapid response system, which can be applied together with electronic health records. This will be a useful method to identify patients with deterioration and help with precise decision-making in daily practice.
    DOI:  https://doi.org/10.1097/CCM.0000000000004236
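The true-alarm rule stated in the abstract above (an alarm 0.5 to 24 hours before a deterioration event) can be sketched as follows. This is a minimal illustration, not the study's code: the function name `is_true_alarm`, the inclusive window bounds, and the example timestamps are all assumptions.

```python
from datetime import datetime, timedelta

# Window bounds taken from the abstract: 0.5 to 24 hours before an event.
MIN_LEAD = timedelta(hours=0.5)
MAX_LEAD = timedelta(hours=24)


def is_true_alarm(alarm_time, event_times):
    """Return True if the alarm precedes any event by 0.5 to 24 hours.

    Inclusive bounds are an assumption; the abstract does not specify them.
    """
    return any(MIN_LEAD <= event - alarm_time <= MAX_LEAD for event in event_times)


# Hypothetical example: one deterioration event at noon on 2 January 2019.
event = datetime(2019, 1, 2, 12, 0)
alarms = [
    datetime(2019, 1, 2, 6, 0),    # 6 h before the event: true alarm
    datetime(2019, 1, 2, 11, 45),  # 15 min before: too late to count
    datetime(2019, 1, 1, 11, 0),   # 25 h before: too early to count
]
true_alarm_flags = [is_true_alarm(a, [event]) for a in alarms]
```

Alarms that fall outside the window count as false alarms under this rule, which is why the alarm count per day and the window definition interact when systems are compared.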
  3. J Magn Reson Imaging. 2020 Feb 12.
      Artificial intelligence (AI) shows tremendous promise in the field of medical imaging, with recent breakthroughs applying deep-learning models for data acquisition, classification problems, segmentation, image synthesis, and image reconstruction. With an eye towards clinical applications, we summarize the active field of deep-learning-based MR image reconstruction. We review the basic concepts of how deep-learning algorithms aid in the transformation of raw k-space data to image data, and specifically examine accelerated imaging and artifact suppression. Recent efforts in these areas show that deep-learning-based algorithms can match and, in some cases, eclipse conventional reconstruction methods in terms of image quality and computational efficiency across a host of clinical imaging applications, including musculoskeletal, abdominal, cardiac, and brain imaging. This article is an introductory overview aimed at clinical radiologists with no experience in deep-learning-based MR image reconstruction and should enable them to understand the basic concepts and current clinical applications of this rapidly growing area of research across multiple organ systems.
    Keywords:  MRI; deep learning; image reconstruction
    DOI:  https://doi.org/10.1002/jmri.27078
  4. Acad Radiol. 2020 Feb 12. pii: S1076-6332(20)30039-8. [Epub ahead of print]
      We deem a computer to exhibit artificial intelligence (AI) when it performs a task that would normally require intelligent action by a human. Much of the recent excitement about AI in the medical literature has revolved around the ability of AI models to recognize anatomy and detect pathology on medical images, sometimes at the level of expert physicians. However, AI can also be used to solve a wide range of noninterpretive problems that are relevant to radiologists and their patients. This review summarizes some of the newer noninterpretive uses of AI in radiology.
    Keywords:  Artificial intelligence; Deep learning; Radiology applications; Radiology education
    DOI:  https://doi.org/10.1016/j.acra.2020.01.012
  5. Front Cardiovasc Med. 2019 ;6 195
      Cardiovascular conditions remain the leading cause of mortality and morbidity worldwide, with genotype being a significant influence on disease risk. Cardiac imaging-genetics aims to identify and characterize the genetic variants that influence functional, physiological, and anatomical phenotypes derived from cardiovascular imaging. High-throughput DNA sequencing and genotyping have greatly accelerated genetic discovery, making variant interpretation one of the key challenges in contemporary clinical genetics. Heterogeneous, low-fidelity phenotyping and difficulties integrating and then analyzing large-scale genetic, imaging and clinical datasets using traditional statistical approaches have impeded progress. Artificial intelligence (AI) methods, such as deep learning, are particularly suited to tackle the challenges of scalability and high dimensionality of data and show promise in the field of cardiac imaging-genetics. Here we review the current state of AI as applied to imaging-genetics research and discuss outstanding methodological challenges, as the field moves from pilot studies to mainstream applications, from one-dimensional global descriptors to high-resolution models of whole-organ shape and function, from univariate to multivariate analysis and from candidate gene to genome-wide approaches. Finally, we consider the future directions and prospects of AI imaging-genetics for ultimately helping understand the genetic and environmental underpinnings of cardiovascular health and disease.
    Keywords:  artificial intelligence; cardiology; cardiovascular imaging; deep learning; genetics; genomics; imaging-genetics; machine learning
    DOI:  https://doi.org/10.3389/fcvm.2019.00195
  6. Front Cardiovasc Med. 2020 ;7 1
      Cardiac imaging plays an important role in the diagnosis of cardiovascular disease (CVD). Until now, its role has been limited to visual and quantitative assessment of cardiac structure and function. However, with the advent of big data and machine learning, new opportunities are emerging to build artificial intelligence tools that will directly assist the clinician in the diagnosis of CVDs. This paper presents a thorough review of recent works in this field and provides the reader with a detailed presentation of the machine learning methods that can be further exploited to enable more automated, precise and early diagnosis of most CVDs.
    Keywords:  artificial intelligence; automated diagnosis; cardiac imaging; cardiovascular disease; deep learning; machine learning; radiomics
    DOI:  https://doi.org/10.3389/fcvm.2020.00001
  7. Front Cardiovasc Med. 2019 ;6 172
      Cardiac computed tomography (CT) allows rapid visualization of the heart and coronary arteries with high spatial resolution. However, analysis of cardiac CT scans for manifestation of coronary artery disease is time-consuming and challenging. Machine learning (ML) approaches have the potential to address these challenges with high accuracy and consistent performance. In this mini review, we present a survey of the literature on ML-based analysis of coronary artery disease in cardiac CT. We summarize ML methods for detection and characterization of atherosclerotic plaque as well as anatomically and functionally significant coronary artery stenosis.
    Keywords:  atherosclerotic plaque; cardiac CT; coronary artery disease; coronary artery stenosis; machine learning
    DOI:  https://doi.org/10.3389/fcvm.2019.00172
  8. Breast Cancer. 2020 Feb 12.
       BACKGROUND: To compare the breast cancer detection performance in digital mammograms of a panel of three unaided human readers (HR) versus a stand-alone artificial intelligence (AI)-based Transpara system in a population of Japanese women.
    METHODS: The subjects were 310 Japanese female outpatients who underwent digital mammographic examinations between January 2018 and October 2018. A panel of three HR provided a Breast Imaging Reporting and Data System (BI-RADS) score, and the Transpara system provided an interactive decision support score and an examination-based cancer likelihood score. The area under the receiver operating characteristic curve (AUC), sensitivity, and specificity were compared under each of the reading conditions.
    RESULTS: The AUC was higher for human readers than for the stand-alone Transpara system (human readers 0.816; Transpara system 0.706; difference 0.11; P < 0.001). The sensitivity of the unaided HR for diagnosis was 89% and specificity was 86%. The sensitivities of the stand-alone Transpara system for cutoff scores of 4 and 7 were 93% and 85%, and the specificities were 45% and 67%, respectively.
    CONCLUSIONS: Although the diagnostic performance of Transpara system was statistically lower than that of HR, the recent advances in AI algorithms are expected to reduce the difference between computers and human experts in detecting breast cancer.
    Keywords:  An interactive decision support score and an examination-based cancer likelihood score; Artificial intelligence (AI); Breast cancer; Digital mammograms
    DOI:  https://doi.org/10.1007/s12282-020-01061-8
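The trade-off reported above, where a higher cutoff score lowers sensitivity and raises specificity, can be illustrated with a small sketch. The scores and labels below are hypothetical, the function name `evaluate_cutoff` is illustrative, and treating a score at or above the cutoff as positive is an assumption rather than a documented property of the Transpara system.

```python
def evaluate_cutoff(scores, labels, cutoff):
    """Return (sensitivity, specificity) when scores >= cutoff are called positive.

    labels: 1 for cancer, 0 for no cancer. The >= convention is an assumption.
    """
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical examination-level scores (1-10) and ground-truth labels.
scores = [2, 5, 8, 9, 3, 6, 1, 10]
labels = [0, 0, 1, 1, 0, 1, 0, 1]

low_cutoff = evaluate_cutoff(scores, labels, cutoff=4)
high_cutoff = evaluate_cutoff(scores, labels, cutoff=7)
```

In this toy data the lower cutoff catches every cancer at the cost of one false positive, while the higher cutoff removes the false positive but misses one cancer. This mirrors the direction of the reported figures but not their values.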
  9. J Magn Reson Imaging. 2020 Feb 11.
       BACKGROUND: Preoperative differentiation of borderline from malignant epithelial ovarian tumors (BEOT from MEOT) can impact surgical management. MRI has improved this assessment but subjective interpretation by radiologists may lead to inconsistent results.
    PURPOSE: To develop and validate an objective MRI-based machine-learning (ML) assessment model for differentiating BEOT from MEOT, and compare the performance against radiologists' interpretation.
    STUDY TYPE: Retrospective study of eight clinical centers.
    POPULATION: In all, 501 women with histopathologically-confirmed BEOT (n = 165) or MEOT (n = 336) from 2010 to 2018 were enrolled. Three cohorts were constructed: a training cohort (n = 250), an internal validation cohort (n = 92), and an external validation cohort (n = 159).
    FIELD STRENGTH/SEQUENCE: Preoperative MRI within 2 weeks of surgery. Single- and multiparameter (MP) machine-learning assessment models were built utilizing the following four MRI sequences: T2-weighted imaging (T2WI), fat saturation (FS), diffusion-weighted imaging (DWI), apparent diffusion coefficient (ADC), and contrast-enhanced (CE)-T1WI.
    ASSESSMENT: Diagnostic performance of the models was assessed for both whole tumor (WT) and solid tumor (ST) components. Assessment of the performance of the model in discriminating BEOT vs. early-stage MEOT was made. Six radiologists of varying experience also interpreted the MR images.
    STATISTICAL TESTS: Mann-Whitney U-test: significance of the clinical characteristics; chi-square test: difference of label; DeLong test: difference of receiver operating characteristic (ROC).
    RESULTS: The MP-ST model performed better than the MP-WT model for both the internal validation cohort (area under the curve [AUC] = 0.932 vs. 0.917) and external validation cohort (AUC = 0.902 vs. 0.767). The model showed capability in discriminating BEOT vs. early-stage MEOT, with AUCs of 0.909 and 0.920, respectively. Radiologist performance was considerably poorer in both the internal (mean AUC = 0.792; range, 0.679-0.924) and external (mean AUC = 0.797; range, 0.744-0.867) validation cohorts.
    DATA CONCLUSION: Performance of the MRI-based ML model was robust and superior to subjective assessment of radiologists. If our approach can be implemented in clinical practice, improved preoperative prediction could potentially lead to preserved ovarian function and fertility for some women.
    LEVEL OF EVIDENCE: Level 4.
    TECHNICAL EFFICACY: Stage 2.
    Keywords:  borderline epithelial ovarian tumor; machine learning; magnetic resonance imaging; malignant epithelial ovarian tumor; preoperative prediction
    DOI:  https://doi.org/10.1002/jmri.27084
  10. Eur Radiol. 2020 Feb 14.
       OBJECTIVES: (1) To assess the methodological quality of radiomics studies investigating histological subtypes, therapy response, and survival in patients with renal cell carcinoma (RCC) and (2) to determine the risk of bias in these radiomics studies.
    METHODS: In this systematic review, literature published since 2000 on radiomics in RCC was included and assessed for methodological quality using the Radiomics Quality Score. The risk of bias was assessed using the Quality Assessment of Diagnostic Accuracy Studies tool and a meta-analysis of radiomics studies focusing on differentiating between angiomyolipoma without visible fat and RCC was performed.
    RESULTS: Fifty-seven studies investigating the use of radiomics in renal cancer were identified, including 4590 patients in total. The average Radiomics Quality Score was 3.41 (9.4% of total) with good inter-rater agreement (ICC 0.96, 95% CI 0.93-0.98). Three studies validated results with an independent dataset; one used a publicly available validation dataset. None of the studies shared the code, images, or regions of interest. The meta-analysis showed moderate heterogeneity among the included studies and an odds ratio of 6.24 (95% CI 4.27-9.12; p < 0.001) for the differentiation of angiomyolipoma without visible fat from RCC.
    CONCLUSIONS: Radiomics algorithms show promise for answering clinical questions where subjective interpretation is challenging or not established. However, the generalizability of findings to prospective cohorts needs to be demonstrated in future trials for progression towards clinical translation. Improved sharing of methods including code and images could facilitate independent validation of radiomics signatures.
    KEY POINTS: • Studies achieved an average Radiomics Quality Score of 10.8%. Common reasons for low Radiomics Quality Scores were unvalidated results, retrospective study design, absence of open science, and insufficient control for multiple comparisons. • A previous training phase allowed reaching almost perfect inter-rater agreement in the application of the Radiomics Quality Score. • Meta-analysis of radiomics studies distinguishing angiomyolipoma without visible fat from renal cell carcinoma shows a moderate diagnostic odds ratio of 6.24 and moderate methodological diversity.
    Keywords:  Angiomyolipoma; Carcinoma, renal cell; Machine learning; Quality improvement; Systematic review
    DOI:  https://doi.org/10.1007/s00330-020-06666-3
  11. Eur Radiol. 2020 Feb 13.
       INTRODUCTION: The aim of the study was to extract anthropometric measures from CT by deep learning and to evaluate their prognostic value in patients with non-small-cell lung cancer (NSCLC).
    METHODS: A convolutional neural network was trained to perform automatic segmentation of subcutaneous adipose tissue (SAT), visceral adipose tissue (VAT), and muscular body mass (MBM) from low-dose CT images in 189 patients with NSCLC who underwent pretherapy PET/CT. After a fivefold cross-validation in a subset of 35 patients, anthropometric measures extracted by deep learning were normalized to the body surface area (BSA) to control the various patient morphologies. VAT/SAT ratio and clinical parameters were included in a Cox proportional-hazards model for progression-free survival (PFS) and overall survival (OS).
    RESULTS: Inference time for a whole volume was about 3 s. Mean Dice similarity coefficients in the validation set were 0.95, 0.93, and 0.91 for SAT, VAT, and MBM, respectively. For PFS prediction, T-stage, N-stage, chemotherapy, radiation therapy, and VAT/SAT ratio were associated with disease progression on univariate analysis. On multivariate analysis, only N-stage (HR = 1.7 [1.2-2.4]; p = 0.006), radiation therapy (HR = 2.4 [1.0-5.4]; p = 0.04), and VAT/SAT ratio (HR = 10.0 [2.7-37.9]; p < 0.001) remained significant prognosticators. For OS, male gender, smoking status, N-stage, a lower SAT/BSA ratio, and a higher VAT/SAT ratio were associated with mortality on univariate analysis. On multivariate analysis, male gender (HR = 2.8 [1.2-6.7]; p = 0.02), N-stage (HR = 2.1 [1.5-2.9]; p < 0.001), and the VAT/SAT ratio (HR = 7.9 [1.7-37.1]; p < 0.001) remained significant prognosticators.
    CONCLUSION: The BSA-normalized VAT/SAT ratio is an independent predictor of both PFS and OS in NSCLC patients.
    KEY POINTS: • Deep learning will make CT-derived anthropometric measures clinically usable as they are currently too time-consuming to calculate in routine practice. • Whole-body CT-derived anthropometrics in non-small-cell lung cancer are associated with progression-free survival and overall survival. • A priori medical knowledge can be implemented in the neural network loss function calculation.
    Keywords:  Adiposity; Lung cancer; Machine learning; Tomography, X-ray computed
    DOI:  https://doi.org/10.1007/s00330-019-06630-w
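The Dice similarity coefficient used above to score the segmentations is defined as 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B. The sketch below computes it for flattened binary masks. The function name and the toy masks are illustrative, and returning 1.0 when both masks are empty is a convention assumed here.

```python
def dice_coefficient(pred, truth):
    """Dice similarity between two flat binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2 * intersection / total


# Toy example: 2 voxels predicted, 2 voxels in the reference, 1 shared.
predicted_mask = [1, 1, 0, 0]
reference_mask = [1, 0, 1, 0]
dice = dice_coefficient(predicted_mask, reference_mask)  # 2*1 / (2+2) = 0.5
```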
  12. BMJ Open Diabetes Res Care. 2020 Jan;8(1): pii: e000892. [Epub ahead of print]
       INTRODUCTION: The aim of this study is to evaluate the performance of the offline smart phone-based Medios artificial intelligence (AI) algorithm in the diagnosis of diabetic retinopathy (DR) using non-mydriatic (NM) retinal images.
    METHODS: This cross-sectional study prospectively enrolled 922 individuals with diabetes mellitus. NM retinal images (disc and macula centered) from each eye were captured using the Remidio NM fundus-on-phone (FOP) camera. The images were run offline and the diagnosis of the AI was recorded (DR present or absent). The diagnosis of the AI was compared with the image diagnosis of five retina specialists (majority diagnosis considered as ground truth).
    RESULTS: Analysis included images from 900 individuals (252 had DR). For any DR, the sensitivity and specificity of the AI algorithm were found to be 83.3% (95% CI 80.9% to 85.7%) and 95.5% (95% CI 94.1% to 96.8%), respectively. The sensitivity and specificity of the AI algorithm in detecting referable DR (RDR) were 93% (95% CI 91.3% to 94.7%) and 92.5% (95% CI 90.8% to 94.2%), respectively.
    CONCLUSION: The Medios AI has a high sensitivity and specificity in the detection of RDR using NM retinal images.
    Keywords:  algorithms; non-mydriatic camera; retinopathy diagnosis; technology and diabetes
    DOI:  https://doi.org/10.1136/bmjdrc-2019-000892
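The reference standard described above, the majority diagnosis of five retina specialists, amounts to a simple vote per image. A minimal sketch, with a hypothetical function name and hypothetical grades:

```python
def majority_diagnosis(grades):
    """Return True (DR present) if more than half of the graders say so.

    An odd number of graders is required so that ties cannot occur.
    """
    if len(grades) % 2 == 0:
        raise ValueError("an odd number of graders is needed to avoid ties")
    return sum(1 for g in grades if g) > len(grades) / 2


# Hypothetical image graded by five specialists: three call DR present.
ground_truth = majority_diagnosis([True, True, False, False, True])
```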
  13. Comput Methods Programs Biomed. 2020 Feb 01. pii: S0169-2607(19)31436-1. [Epub ahead of print]190 105381
    Taiwan Stroke Registry Investigators
       INTRODUCTION: Being able to predict functional outcomes after a stroke is highly desirable for clinicians. This allows clinicians to set reasonable goals with patients and relatives, and to reach shared after-care decisions for recovery or rehabilitation. The aim of this study was to apply various machine learning (ML) methods for 90-day stroke outcome predictions, using a nationwide disease registry.
    METHODS: This study used the Taiwan Stroke Registry (TSR) which has prospectively collected data from stroke patients since 2006. Three known ML models (support vector machine, random forest, and artificial neural network), and a hybrid artificial neural network were implemented and evaluated by 10-time repeated hold-out with 10-fold cross-validation.
    RESULTS: ML techniques present over 0.94 AUC in both ischemic and hemorrhagic stroke using preadmission and inpatient data. By adding follow-up data, the prediction ability improved to 0.97 AUC. We screened 206 clinical variables to identify 17 important features from the ischemic stroke dataset and 22 features from the hemorrhagic stroke dataset without losing much performance. Error analysis revealed that most prediction errors come from more severe stroke patients.
    CONCLUSION: The study showed that ML techniques trained on large, cross-regional registry datasets were able to predict functional outcome after stroke with high accuracy. The follow-up data are important and can further improve the predictive models' performance. With similar performances among different ML techniques, the algorithm's characteristics and performance on severe stroke patients will be the primary focus when we further develop inference models and artificial intelligence tools for potential medical.
    Keywords:  Hemorrhagic stroke; Ischemic stroke; Machine learning; Stroke outcome
    DOI:  https://doi.org/10.1016/j.cmpb.2020.105381
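The evaluation scheme named above ("10-time repeated hold-out with 10-fold cross-validation") is not spelled out in the abstract. One plausible reading is repeated k-fold cross-validation, where the data are reshuffled before each round of folds. The sketch below generates index splits under that assumption; the function name, the seed, and the round-robin fold assignment are illustrative choices, not the study's protocol.

```python
import random


def repeated_kfold_indices(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh shuffle for each repeat
        # Round-robin assignment keeps fold sizes within one sample of each other.
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for fold, test_idx in enumerate(folds):
            train_idx = [i for j, f in enumerate(folds) if j != fold for i in f]
            yield repeat, fold, train_idx, test_idx


# Hypothetical small run: 25 samples, 5 folds, 2 repeats gives 10 splits.
splits = list(repeated_kfold_indices(25, n_folds=5, n_repeats=2))
```

Each split would be used to fit a model on `train_idx` and score it on `test_idx`, with the reported AUC taken as a summary over all splits.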
  14. Clin Neurol Neurosurg. 2020 Feb 10. pii: S0303-8467(20)30075-5. [Epub ahead of print]192 105732
       OBJECTIVES: Neurosurgical audits are an important part of improving the safety, efficiency and quality of care but require considerable resources, time, and funding. To that end, the advent of the Artificial Intelligence-based algorithms offered a novel, more economically viable solution. The aim of the study was to evaluate whether the algorithm can indeed outperform humans in that task.
    PATIENTS & METHODS: Forty-six human students were invited to inspect the clinical notes of 45 medical outliers on a neurosurgical ward. The aim of the task was to produce a report containing a quantitative analysis of the scale of the problem (e.g. time to discharge) and a qualitative list of suggestions on how to improve the patient flow, quality of care, and healthcare costs. The Artificial Intelligence-based Frideswide algorithm (FwA) was used to analyse the same dataset.
    RESULTS: The FwA produced 44 recommendations whilst human students reported an average of 3.89. The mean time to deliver the final report was 5.80 s for the FwA and 10.21 days for humans. The mean relative error for factual inaccuracy for humans was 14.75 % for total waiting times and 81.06 % for times between investigations. The report produced by the FwA was entirely factually correct. 13 out of 46 students submitted an unfinished audit, 3 out of 46 made an overdue submission. Thematic analysis revealed numerous internal contradictions of the recommendations given by human students.
    CONCLUSION: The AI-based algorithm can produce significantly more recommendations in shorter time. The audits conducted by the AI are more factually accurate (0 % error rate) and logically consistent (no thematic contradictions). This study shows that the algorithm can produce reliable neurosurgical audits for a fraction of the resources required to conduct it by human means.
    Keywords:  Artificial intelligence; Computational neuroscience; Medical audits; Neurosurgery audits
    DOI:  https://doi.org/10.1016/j.clineuro.2020.105732
  15. Evid Based Ment Health. 2020 Feb;23(1): 34-38
       BACKGROUND: All patients admitted to an acute inpatient mental health unit must have nursing observations carried out at night either hourly or every 15 minutes, to ascertain that they are safe and breathing. However, while this practice ensures patient safety, it can also disturb patients' sleep, which in turn can impact negatively on their recovery.
    OBJECTIVE: This article describes the process of introducing artificial intelligence ('digitally assisted nursing observations') in an acute mental health inpatient ward, to enable staff to carry out the hourly and the 15 minutes observations, minimising disruption of patients' sleep while maintaining their safety.
    FINDINGS: The preliminary data obtained indicate that the digitally assisted nursing observations agreed with the observations without sensors when both were carried out in parallel and that over an estimated 755 patient nights, the new system has not been associated with any untoward incidents. Preliminary qualitative data suggest that the new technology improves patients' and staff's experience at night.
    DISCUSSION: This project suggests that the digitally assisted nursing observations could maintain patients' safety while potentially improving patients' and staff's experience in an acute psychiatric ward. The limitations of this study, namely, its narrative character and the fact that patients were not randomised to the new technology, suggest taking the reported findings as qualitative and preliminary.
    CLINICAL IMPLICATIONS: These results suggest that the care provided at night in acute inpatient psychiatric units could be substantially improved with this technology. This warrants a more thorough and stringent evaluation.
    Keywords:  Schizophrenia & psychotic disorders; adult psychiatry; depression & mood disorders; suicide & self-harm
    DOI:  https://doi.org/10.1136/ebmental-2019-300136
  16. Spine J. 2020 Feb 08. pii: S1529-9430(20)30047-4. [Epub ahead of print]
       BACKGROUND: Degenerative cervical myelopathy (DCM) is the most common cause of spinal cord dysfunction worldwide. Current guidelines recommend management based on the severity of myelopathy, measured by the modified Japanese Orthopedic Association (mJOA) score. Patients with moderate to severe myelopathy, defined by an mJOA below 15, are recommended to undergo surgery. However, the management for mild myelopathy (mJOA between 15 and 17) is controversial since the response to surgery is more heterogeneous.
    PURPOSE: To develop machine learning algorithms predicting phenotypes of mild myelopathy patients that would benefit most from surgery.
    STUDY DESIGN: Retrospective subgroup analysis of prospectively collected data.
    PATIENT SAMPLES: Data were obtained from 193 mild DCM patients who underwent surgical decompression and were enrolled in the multicenter AOSpine CSM clinical trials.
    OUTCOME MEASURES: The mJOA score, an assessment of functional status, was used to isolate patients with mild DCM. The primary outcome measures were change from baseline for the Short Form-36 (SF-36) mental component summary (MCS) and physical component summary (PCS) at 1-year post-surgery. These changes were dichotomized according to whether they exceeded the minimal clinically important difference (MCID).
    METHODS: The data were split into training (75%) and testing (25%) sets. Model predictors included baseline demographic variables and clinical presentation. Seven machine learning algorithms and a logistic regression model were trained and optimized using the training set, and their performance was evaluated using the testing set. For each outcome (improvement in MCS or PCS), the ML algorithm with the greatest AUC on the training set was selected for further analysis.
    RESULTS: The generalized boosted model (GBM) and earth models performed well in the prediction of significant improvement in MCS and PCS respectively, with AUCs of 0.72-0.78 on the training set. This performance was replicated on the testing set, in which the GBM and earth models showed AUCs of 0.77 and 0.78 respectively, as well as fair to good calibration across the predicted range of probabilities. Female patients with a low initial MCS were less likely to experience significant improvement in MCS than males. The presence of certain signs and symptoms (e.g. lower limb spasticity, clumsy hands) was also predictive of worse outcome.
    CONCLUSIONS: Machine learning models showed good predictive power and provided information about the phenotypes of mild DCM patients most likely to benefit from surgical intervention. Overall, machine learning may be a useful tool for management of mild DCM, though external validation and prospective analysis should be performed to better solidify its role.
    Keywords:  Epidemiology; Mild Degenerative Cervical Myelopathy; Predictive Models; Quality-Of-Life Outcomes; Spine Surgery; degenerative spine disease; machine learning
    DOI:  https://doi.org/10.1016/j.spinee.2020.02.003
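Two steps in the entry above lend themselves to a short sketch: dichotomizing a change score against the MCID, and computing the AUC of predicted probabilities against those binary outcomes. The AUC here is computed as the probability that a randomly chosen improved patient receives a higher score than a randomly chosen non-improved patient, with ties counted as one half. The MCID value, the change scores, and the predictions are all hypothetical, and using a strict "greater than" for "exceeded" is an assumption.

```python
def dichotomize(changes, mcid):
    """Label 1 if the change from baseline exceeds the MCID, else 0."""
    return [1 if change > mcid else 0 for change in changes]


def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical 1-year changes in a component summary score, MCID of 4 points.
changes = [1.0, 3.5, 6.0, 9.0]
outcomes = dichotomize(changes, mcid=4.0)        # [0, 0, 1, 1]
predicted = [0.10, 0.40, 0.35, 0.80]             # hypothetical model outputs
auc = roc_auc(outcomes, predicted)               # 3 of 4 pairs ranked correctly
```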
  17. Healthcare (Basel). 2020 Feb 07;8(1): pii: E34. [Epub ahead of print]
      Parkinson's disease is caused by the progressive loss of dopaminergic neurons in the substantia nigra pars compacta (SNc). Presently, with the exponential growth of the aging population across the world, the number of people affected by the disease is also increasing, and it imposes a huge economic burden on governments. However, to date, no therapy or treatment has been found that can completely eradicate the disease. Therefore, early detection of Parkinson's disease is very important so that the progressive loss of dopaminergic neurons can be controlled to provide the patients with a better life. In this study, 3T T1-MRI scans were collected from 906 subjects, of whom 203 are control subjects, 66 are prodromal subjects and 637 are Parkinson's disease patients. To analyze the MRI scans for the detection of neurodegeneration and Parkinson's disease, eight subcortical structures were segmented from the acquired MRI scans using atlas-based segmentation. Feature extraction was then performed on the eight extracted subcortical structures to obtain textural, morphological and statistical features. After the feature extraction process, an exhaustive set of 107 features was generated for each MRI scan. Therefore, a two-level feature extraction process was implemented for finding the best possible feature set for the detection of Parkinson's disease. The two-level feature extraction procedure leveraged correlation analysis and recursive feature elimination, which yielded the 20 best-performing features out of the extracted 107 features. Further, all the features were used to train machine learning algorithms, and a comparative analysis was performed between four different machine learning algorithms based on the selected performance metrics. Finally, it was observed that the artificial neural network (multi-layer perceptron) performed the best, providing an overall accuracy of 95.3%, overall recall of 95.41%, overall precision of 97.28% and f1-score of 94%.
    Keywords:  3D; MRI; Parkinson’s disease; morphology; neurodegeneration; texture
    DOI:  https://doi.org/10.3390/healthcare8010034
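The abstract above names correlation analysis as one of the two levels of its feature reduction but gives no details. The following is a minimal sketch of one common form of that step, assuming a greedy filter that drops any feature whose absolute Pearson correlation with an already-kept feature reaches a threshold. The function names, the 0.9 threshold and the toy values are all hypothetical and are not taken from the paper. The recursive feature elimination level is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.9):
    """Greedy filter: keep a feature only if |r| < threshold against every kept feature.

    `features` maps a feature name to its per-scan values. The threshold is an
    illustrative assumption, not a value reported in the abstract.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data (hypothetical): "volume_scaled" is a rescaled copy of "volume".
toy = {
    "volume": [1.0, 2.0, 3.0, 4.0, 5.0],
    "volume_scaled": [2.0, 4.0, 6.0, 8.0, 10.0],
    "texture": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_correlated(toy)
print(selected)
```

Because the filter is greedy, the result depends on the order of the features. A real pipeline would likely pass the surviving features on to a wrapper method such as recursive feature elimination.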
  18. Neuroscience. 2020 Feb 11. pii: S0306-4522(20)30079-8. [Epub ahead of print]
      Resting-state functional MRI (RS-fMRI) in Parkinson's disease (PD) has been widely analyzed using standard statistical tests; however, the machine learning approach has not yet been investigated in PD using RS-fMRI. In the current study, we utilized the mean regional amplitude values as features in patients with PD (n = 72) and in healthy controls (HC, n = 89). The t-test and a linear support vector machine were employed to select the features and make predictions, respectively. Three frequency bins (Slow-5: 0.0107 - 0.0286 Hz; Slow-4: 0.0286 - 0.0821 Hz; Conventional: 0.01 - 0.08 Hz) were analyzed. Our results showed that Slow-4 may provide more important information than Slow-5 in PD, and that it had almost identical classification performance to the Combined (Slow-5 and Slow-4) and Conventional frequency bands. Similar to previous neuroimaging studies in PD, the discriminative regions mainly included the disrupted motor system, aberrant visual cortex, and dysfunctional paralimbic/limbic and basal ganglia networks. The lateral parietal lobe, such as the right IPL and SMG, was detected as a discriminative feature exclusively in Slow-4. Our findings indicate, for the first time, that the machine learning approach is a promising choice for detecting abnormal regions in PD, and that a multi-frequency scheme would provide more specific information.
    Keywords:  ALFF; Frequency Specificity; Machine Learning; Parkinson’s Disease; Resting Brain
    DOI:  https://doi.org/10.1016/j.neuroscience.2020.01.049
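The abstract above says a t-test was used to select features before classification. As a rough illustration of that step only, the sketch below ranks features by the absolute value of a two-sample t statistic. It assumes Welch's unequal-variance form, which the abstract does not specify. The region names and values are hypothetical, and the linear support vector machine stage is not shown.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic (returns 0.0 if both groups are constant)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 0.0
    return (mean(a) - mean(b)) / se


def rank_features(group_a, group_b):
    """Order feature names by |t| between the two groups, largest first.

    Each group maps a feature name to its per-subject values.
    """
    scores = {name: abs(welch_t(group_a[name], group_b[name])) for name in group_a}
    return sorted(scores, key=scores.get, reverse=True)


# Toy amplitude values (hypothetical): region_2 separates the groups, region_1 barely does.
patients = {
    "region_1": [1.0, 1.2, 0.9, 1.1],
    "region_2": [2.0, 2.1, 1.9, 2.0],
}
controls = {
    "region_1": [1.0, 1.1, 1.0, 0.9],
    "region_2": [1.0, 1.1, 0.9, 1.0],
}
ranking = rank_features(patients, controls)
top_features = ranking[:1]  # keep the single most discriminative feature
print(ranking, top_features)
```

In practice, selection of this kind should happen inside each cross-validation fold, so that test subjects do not influence which features are kept.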
  19. Alzheimers Dement. 2020 Feb 11.
    Alzheimer's disease neuroimaging initiative (ADNI)
       INTRODUCTION: Developing cross-validated multi-biomarker models for the prediction of the rate of cognitive decline in Alzheimer's disease (AD) is a critical yet unmet clinical challenge.
    METHODS: We applied support vector regression to AD biomarkers derived from cerebrospinal fluid, structural magnetic resonance imaging (MRI), amyloid-PET and fluorodeoxyglucose positron-emission tomography (FDG-PET) to predict rates of cognitive decline. Prediction models were trained in autosomal-dominant Alzheimer's disease (ADAD, n = 121) and subsequently cross-validated in sporadic prodromal AD (n = 216). The sample size needed to detect treatment effects when using model-based risk enrichment was estimated.
    RESULTS: A model combining all biomarker modalities and established in ADAD predicted the 4-year rate of decline in global cognition (R² = 24%) and memory (R² = 25%) in sporadic AD. Model-based risk-enrichment reduced the sample size required for detecting simulated intervention effects by 50%-75%.
    DISCUSSION: Our independently validated machine-learning model predicted cognitive decline in sporadic prodromal AD and may substantially reduce the sample size needed in clinical trials in AD.
    Keywords:  Alzheimer's disease; MRI; PET; autosomal-dominant Alzheimer's disease; biomarkers; machine learning; progression prediction; risk enrichment
    DOI:  https://doi.org/10.1002/alz.12032
  20. Int J Med Inform. 2020 Feb 04. pii: S1386-5056(19)31364-4. [Epub ahead of print] 136: 104094
       INTRODUCTION: Research has shown that frailty, a geriatric syndrome associated with an increased risk of negative outcomes for older people, is highly prevalent among residents of residential aged care facilities (also called long term care facilities or nursing homes). However, progress on effective identification of frailty within residential care remains at an early stage, necessitating the development of new methods for accurate and efficient screening.
    OBJECTIVES: We aimed to determine the effectiveness of artificial intelligence (AI) algorithms in accurately identifying frailty among residents aged 75 years and over in comparison with a calculated electronic Frailty Index (eFI) based on a routinely-collected residential aged care administrative data set drawn from 10 residential care facilities located in Queensland, Australia. A secondary objective included the identification of best-performing candidate algorithms.
    METHODS: We designed a frailty prediction system based on the eFI identification of frailty, allocating 84.5 % and 15.5 % of the data to training and test data sets respectively. We compared the performance of 18 specific scenarios to predict frailty against eFI based on unique combinations of three ML algorithms (support vector machines [SVM], decision trees [DT] and K-nearest neighbours [KNN]) and six cases (6, 10, 11, 14, 39 and 70 input variables). We calculated accuracy, percentage positive and negative agreement, sensitivity, specificity, Cohen's kappa and Prevalence- and Bias-Adjusted Kappa (PABAK), table frequencies and positive and negative predictive values.
    RESULTS: Of 592 eligible resident records, 500 were allocated to the training set and 92 to the test set. Three scenarios (10, 11 and 70 input variables), all based on SVM algorithm, returned overall accuracy above 75 %.
    CONCLUSIONS: There is some potential for AI techniques to contribute towards better frailty identification within residential care. However, potential benefits will need to be weighed against administrative burden, data quality concerns and presence of potential bias.
    Keywords:  Artificial intelligence; Frailty; Health records; Machine learning; Personal; Residential facilities
    DOI:  https://doi.org/10.1016/j.ijmedinf.2020.104094
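Several of the agreement measures listed in the abstract above can be computed directly from a 2x2 table of predicted versus reference frailty status. The sketch below uses the standard formulas for accuracy, sensitivity, specificity, Cohen's kappa and PABAK (2 x observed agreement - 1). The function name and the counts are hypothetical and are not the study's data.

```python
def agreement_stats(tp, fp, fn, tn):
    """Agreement measures from a 2x2 table (prediction vs. reference standard).

    Assumes every marginal total is non-zero, so no division by zero occurs.
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # proportion of records where both labels agree
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "accuracy": observed,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "kappa": (observed - expected) / (1 - expected),
        "pabak": 2 * observed - 1,
    }


# Hypothetical counts, balanced so that kappa and PABAK coincide.
stats = agreement_stats(tp=40, fp=10, fn=10, tn=40)
print(stats)
```

With balanced marginals, as here, kappa and PABAK give the same value. They diverge when frailty is very common or very rare, or when the predictions are biased toward one class. That is the situation PABAK is meant to adjust for.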
  21. Br J Surg. 2020 Feb 11.
       BACKGROUND: Acute aortic syndrome (AAS) comprises a complex and potentially fatal group of conditions requiring emergency specialist management. The aim of this study was to build a prediction algorithm to assist prehospital triage of AAS.
    METHODS: Details of consecutive patients enrolled in a regional specialist aortic network were collected prospectively. Two prediction algorithms for AAS based on logistic regression and an ensemble machine learning method called SuperLearner (SL) were developed. Undertriage was defined as the proportion of patients with AAS not transported to the specialist aortic centre, and overtriage as the proportion of patients with alternative diagnoses but transported to the specialist aortic centre.
    RESULTS: Data for 976 hospital admissions between February 2010 and June 2017 were included; 609 (62·4 per cent) had AAS. Overtriage and undertriage rates were 52·3 and 16·1 per cent respectively. The population was divided into a training cohort (743 patients) and a validation cohort (233). The area under the receiver operating characteristic (ROC) curve values for the logistic regression score and the SL were 0·68 (95 per cent c.i. 0·64 to 0·72) and 0·87 (0·84 to 0·89) respectively (P < 0·001) in the training cohort, and 0·67 (0·60 to 0·74) and 0·73 (0·66 to 0·79) in the validation cohort (P = 0·038). The logistic regression score was associated with undertriage and overtriage rates of 33·7 (bootstrapped 95 per cent c.i. 29·3 to 38·3) and 7·2 (4·8 to 9·8) per cent respectively, whereas the SL yielded undertriage and overtriage rates of 1·0 (0·3 to 2·0) and 30·2 (25·8 to 34·8) per cent respectively.
    CONCLUSION: A machine learning prediction model performed well in discriminating AAS and could be clinically useful in prehospital triage of patients with suspected AAS.
    DOI:  https://doi.org/10.1002/bjs.11442
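The undertriage and overtriage definitions in the abstract above translate into a short calculation. The sketch below assumes the denominators are all patients with AAS (for undertriage) and all patients with an alternative diagnosis (for overtriage). The abstract does not state the denominators explicitly, and the function name and patient flags are hypothetical.

```python
def triage_rates(has_aas, sent_to_centre):
    """Return (undertriage, overtriage) as fractions between 0 and 1.

    undertriage: patients with AAS who were NOT sent to the specialist centre,
                 divided by all patients with AAS (assumed denominator).
    overtriage:  patients with another diagnosis who WERE sent to the centre,
                 divided by all patients with another diagnosis (assumed denominator).
    """
    pairs = list(zip(has_aas, sent_to_centre))
    aas_total = sum(1 for aas, _ in pairs if aas)
    other_total = len(pairs) - aas_total
    missed = sum(1 for aas, sent in pairs if aas and not sent)
    unnecessary = sum(1 for aas, sent in pairs if not aas and sent)
    undertriage = missed / aas_total if aas_total else 0.0
    overtriage = unnecessary / other_total if other_total else 0.0
    return undertriage, overtriage


# Hypothetical cohort: 4 patients with AAS (1 not transferred), 6 without (3 transferred).
has_aas = [True, True, True, True, False, False, False, False, False, False]
sent = [True, True, True, False, True, True, True, False, False, False]
under, over = triage_rates(has_aas, sent)
print(f"undertriage={under:.1%}, overtriage={over:.1%}")
```

The two rates trade off against each other. A rule that sends more patients to the specialist centre lowers undertriage and raises overtriage, which matches the pattern reported for the two models in the abstract.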