bims-arihec Biomed News
on Artificial Intelligence in Healthcare
Issue of 2019‒12‒08
twenty-two papers selected by
Céline Bélanger
Cogniges Inc.


  1. J Med Imaging Radiat Sci. 2019 Nov 29. pii: S1939-8654(19)30543-0. [Epub ahead of print]
    Wiljer D, Hakim Z.
      Artificial intelligence (AI) has the potential to impact almost every aspect of health care, from detection to prediction and prevention. The adoption of new technologies in health care, however, lags far behind the emergence of new technologies. Health care professionals and organizations must be prepared to change and evolve to adopt these new technologies. A basic understanding of emerging AI technologies will be essential for all health care professionals. These technologies include expert systems, robotic process automation, natural language processing, machine learning, and deep learning. Health care professionals and organizations must build their capacity and capabilities to understand and appropriately adopt these technologies. This understanding starts with basic AI literacy, including data governance principles, basic statistics, data visualization, and the impact on clinical processes. Health care professionals and organizations will need to overcome several challenges and tackle core structural issues, such as access to data and the readiness of algorithms for clinical practice. However, health care professionals have an opportunity to shape the way that AI will be used and the outcomes that will be achieved. There is an urgent and emerging need for education and training so that appropriate technologies can be rapidly adopted, resulting in a healthier world for our patients and our communities.
    Keywords:  AI literacy; AI-enabled health professions education; Artificial intelligence; digital healthcare; professional development
    DOI:  https://doi.org/10.1016/j.jmir.2019.09.010
  2. Eur J Radiol. 2019 Nov 23. pii: S0720-048X(19)30418-8. [Epub ahead of print] 122: 108768
    Safdar NM, Banja JD, Meltzer CC.
      With artificial intelligence (AI) precipitously perched at the apex of the hype curve, the promise of transforming the disparate fields of healthcare, finance, journalism, and security and law enforcement, among others, is enormous. For healthcare - particularly radiology - AI is anticipated to facilitate improved diagnostics, workflow, and therapeutic planning and monitoring. And, while it is also causing some trepidation among radiologists regarding its uncertain impact on the demand and training of our current and future workforce, most of us welcome the potential to harness AI for transformative improvements in our ability to diagnose disease more accurately and earlier in the populations we serve.
    Keywords:  Artificial intelligence; Ethics; Machine learning; Radiology
    DOI:  https://doi.org/10.1016/j.ejrad.2019.108768
  3. Jt Comm J Qual Patient Saf. 2019 Nov 27. pii: S1553-7250(19)30396-4. [Epub ahead of print]
    Rozenblum R, Rodriguez-Monguio R, Volk LA, Forsythe KJ, Myers S, McGurrin M, Williams DH, Bates DW, Schiff G, Seoane-Vazquez E.
      BACKGROUND: Clinical decision support (CDS) alerting tools can identify and reduce medication errors. However, they are typically rule-based and can identify only the errors previously programmed into their alerting logic. Machine learning holds promise for improving medication error detection and reducing costs associated with adverse events. This study evaluates the ability of a machine learning system (MedAware) to generate clinically valid alerts and estimates the cost savings associated with potentially prevented adverse events.
    METHODS: Alerts were generated retrospectively by the MedAware system on outpatient data from two academic medical centers between 2009 and 2013. MedAware alerts were compared to alerts in an existing CDS system. A random sample of 300 alerts was selected for medical record review. Frequency and severity of potential outcomes of alerted medication errors of medium and high clinical value were estimated, along with associated health care costs of these potentially prevented adverse events.
    RESULTS: A total of 10,668 alerts were generated. Overall, 68.2% of MedAware alerts would not have been generated by the existing CDS system. Ninety-two percent of a random sample of the chart-reviewed alerts were accurate based on structured data available in the record, and 79.7% were clinically valid. Estimated cost of adverse events potentially prevented in an outpatient setting was more than $60 per drug alert and $1.3 million when extrapolating study findings to the full patient population.
    CONCLUSION: A machine learning system identified clinically valid medication error alerts that might otherwise be missed with existing CDS systems. Estimates show potential for cost savings associated with potentially prevented adverse events.
    DOI:  https://doi.org/10.1016/j.jcjq.2019.09.008
  4. J Am Med Inform Assoc. 2019 Dec 03. pii: ocz200. [Epub ahead of print]
    Wu S, Roberts K, Datta S, Du J, Ji Z, Si Y, Soni S, Wang Q, Wei Q, Xiang Y, Zhao B, Xu H.
      OBJECTIVE: This article methodically reviews the literature on deep learning (DL) for natural language processing (NLP) in the clinical domain, providing quantitative analysis to answer 3 research questions concerning methods, scope, and context of current research.
    MATERIALS AND METHODS: We searched MEDLINE, EMBASE, Scopus, the Association for Computing Machinery Digital Library, and the Association for Computational Linguistics Anthology for articles using DL-based approaches to NLP problems in electronic health records. After screening 1,737 articles, we collected data on 25 variables across 212 papers.
    RESULTS: DL in clinical NLP publications more than doubled each year, through 2018. Recurrent neural networks (60.8%) and word2vec embeddings (74.1%) were the most popular methods; the information extraction tasks of text classification, named entity recognition, and relation extraction were dominant (89.2%). However, there was a "long tail" of other methods and specific tasks. Most contributions were methodological variants or applications, but 20.8% were new methods of some kind. The earliest adopters were in the NLP community, but the medical informatics community was the most prolific.
    DISCUSSION: Our analysis shows growing acceptance of deep learning as a baseline for NLP research, and of DL-based NLP in the medical community. A number of common associations were substantiated (eg, the preference of recurrent neural networks for sequence-labeling named entity recognition), while others were surprisingly nuanced (eg, the scarcity of French language clinical NLP with deep learning).
    CONCLUSION: Deep learning has not yet fully penetrated clinical NLP and is growing rapidly. This review highlighted both the popular and unique trends in this active field.
    Keywords:  deep learning; electronic health records; methodical, review, clinical text; natural language processing
    DOI:  https://doi.org/10.1093/jamia/ocz200
  5. Europace. 2019 Dec 04. pii: euz324. [Epub ahead of print]
    Kwon JM, Jeon KH, Kim HM, Kim MJ, Lim SM, Kim KH, Song PS, Park J, Choi RK, Oh BH.
      AIMS: Although left ventricular hypertrophy (LVH) has a high incidence and clinical importance, the conventional diagnostic criteria for detecting LVH using electrocardiography (ECG) have not been satisfactory. We aimed to develop an artificial intelligence (AI) algorithm for detecting LVH.
    METHODS AND RESULTS: This retrospective cohort study involved the review of 21 286 patients who were admitted to two hospitals between October 2016 and July 2018 and underwent 12-lead ECG and echocardiography within 4 weeks. The patients in one hospital were divided into a derivation and internal validation dataset, while the patients in the other hospital were included in only an external validation dataset. An AI algorithm based on an ensemble neural network (ENN) combining convolutional and deep neural networks was developed using the derivation dataset. We also visualized the ECG area that the AI algorithm used to make the decision. The area under the receiver operating characteristic curve of the AI algorithm based on ENN was 0.880 (95% confidence interval 0.877-0.883) and 0.868 (0.865-0.871) during the internal and external validations, respectively. These results significantly outperformed the cardiologist's clinical assessment with the Romhilt-Estes point system and Cornell voltage criteria, the Sokolov-Lyon criteria, and the ECG machine's interpretation. At the same specificity, the AI algorithm based on ENN achieved 159.9%, 177.7%, and 143.8% higher sensitivities than those of the cardiologist's assessment, the Sokolov-Lyon criteria, and the ECG machine's interpretation, respectively.
    CONCLUSION: An AI algorithm based on ENN was highly capable of detecting LVH and outperformed cardiologists, conventional methods, and other machine learning techniques.
    Keywords:  Machine learning; Artificial intelligence; Deep learning; Electrocardiography; Hypertrophy; Left ventricular
    DOI:  https://doi.org/10.1093/europace/euz324
  6. Atherosclerosis. 2019 Nov 02. pii: S0021-9150(19)31552-7. [Epub ahead of print] 292: 171-177
    Lee JG, Ko J, Hae H, Kang SJ, Kang DY, Lee PH, Ahn JM, Park DW, Lee SW, Kim YH, Lee CW, Park SW, Park SJ.
      BACKGROUND AND AIMS: Intravascular ultrasound (IVUS)-derived morphological criteria are poor predictors of the functional significance of intermediate coronary stenosis. IVUS-based supervised machine learning (ML) algorithms were developed to identify lesions with a fractional flow reserve (FFR) ≤0.80 (vs. >0.80).
    METHODS: A total of 1328 patients with 1328 non-left main coronary lesions were randomized into training and test sets in a 4:1 ratio. Masked IVUS images were generated by an automatic segmentation model, and 99 computed IVUS features and six clinical variables (age, gender, body surface area, vessel type, involved segment, and involvement of the proximal left anterior descending artery) were used for ML training with 5-fold cross-validation. Diagnostic performances of the binary classifiers (L2 penalized logistic regression, artificial neural network, random forest, AdaBoost, CatBoost, and support vector machine) for detecting ischemia-producing lesions were evaluated using the non-overlapping test samples.
    RESULTS: In the classification of test set lesions into those with an FFR ≤0.80 vs. >0.80, the overall diagnostic accuracies for predicting an FFR ≤0.80 were 82% with L2 penalized logistic regression, 80% with artificial neural network, 83% with random forest, 83% with AdaBoost, 81% with CatBoost, and 81% with support vector machine (AUCs: 0.84-0.87). With exclusion of the 28 lesions with borderline FFR of 0.75-0.80, the overall accuracies for the test set were 86% with L2 penalized logistic regression, 85% with an artificial neural network, 87% with random forest, 87% with AdaBoost, 85% with CatBoost, and 85% with support vector machine.
    CONCLUSIONS: The IVUS-based ML algorithms showed good diagnostic performance for identifying ischemia-producing lesions, and may reduce the need for pressure wires.
    Keywords:  Artificial intelligence; Fractional flow reserve; Intravascular ultrasound; Machine learning
    DOI:  https://doi.org/10.1016/j.atherosclerosis.2019.10.022
  7. Pulm Circ. 2019 Oct-Dec;9(4): 2045894019890549
    Kiely DG, Doyle O, Drage E, Jenner H, Salvatelli V, Daniels FA, Rigg J, Schmitt C, Samyshkin Y, Lawrie A, Bergemann R.
      Idiopathic pulmonary arterial hypertension is a rare and life-shortening condition often diagnosed at an advanced stage. Despite increased awareness, the delay to diagnosis remains unchanged. This study explores whether a predictive model based on healthcare resource utilisation can be used to screen large populations to identify patients at high risk of idiopathic pulmonary arterial hypertension. Hospital Episode Statistics from the National Health Service in England, providing close to full national coverage, were used as a measure of healthcare resource utilisation. Data for patients with idiopathic pulmonary arterial hypertension from the National Pulmonary Hypertension Service in Sheffield were linked to pre-diagnosis Hospital Episode Statistics records. A non-idiopathic pulmonary arterial hypertension control cohort was selected from the Hospital Episode Statistics population. Patient history was limited to ≤5 years pre-diagnosis. Information on demographics, timing/frequency of diagnoses, medical specialities visited and procedures undertaken was captured. For modelling, a bagged gradient boosting trees algorithm was used to discriminate between cohorts. Between 2008 and 2016, 709 patients with idiopathic pulmonary arterial hypertension were identified and compared with a stratified cohort of 2,812,458 patients classified as non-idiopathic pulmonary arterial hypertension with ≥1 ICD-10 coded diagnosis of relevance to idiopathic pulmonary arterial hypertension. A predictive model was developed and validated using cross-validation. The timing and frequency of the clinical speciality seen, secondary diagnoses and age were key variables driving the algorithm's performance. To identify the 100 patients at highest risk of idiopathic pulmonary arterial hypertension, 969 patients would need to be screened with a specificity of 99.99% and sensitivity of 14.10% based on a prevalence of 5.5/million. The positive predictive and negative predictive values were 10.32% and 99.99%, respectively. This study highlights the potential application of artificial intelligence to readily available real-world data to screen for rare diseases such as idiopathic pulmonary arterial hypertension. This algorithm could provide low-cost screening at a population level, facilitating earlier diagnosis, improved diagnostic rates and patient outcomes. Studies to further validate this approach are warranted.
    Keywords:  diagnosis; idiopathic pulmonary arterial hypertension (PAH); machine learning; predictive algorithm
    DOI:  https://doi.org/10.1177/2045894019890549
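Note on entry 7: the screening figures quoted in the abstract can be cross-checked with simple arithmetic. The Python sketch below is only an illustration. It assumes that the 100 highest-risk patients are true cases among the 969 patients flagged, and that sensitivity is taken over the 709 patients in the study cohort. The abstract says its values are based on a prevalence of 5.5/million, so the authors' actual calculation may differ; the variable and function names are hypothetical.

```python
# Figures quoted in the abstract of entry 7.
flagged_patients = 969   # patients who would need to be screened
true_cases_found = 100   # IPAH patients assumed to be among those flagged
total_ipah_cases = 709   # IPAH patients identified in the study cohort


def positive_predictive_value(true_positives: int, flagged: int) -> float:
    """Share of flagged patients who actually have the condition."""
    return true_positives / flagged


def sensitivity(true_positives: int, all_cases: int) -> float:
    """Share of all patients with the condition who are flagged."""
    return true_positives / all_cases


ppv = positive_predictive_value(true_cases_found, flagged_patients)
sens = sensitivity(true_cases_found, total_ipah_cases)
screened_per_case = flagged_patients / true_cases_found

print(f"PPV: {ppv:.2%}")                                      # PPV: 10.32%
print(f"Sensitivity: {sens:.2%}")                             # Sensitivity: 14.10%
print(f"Screened per case found: {screened_per_case:.2f}")    # 9.69
```

Under these assumptions the ratios reproduce the reported positive predictive value (10.32%) and sensitivity (14.10%), which corresponds to roughly ten patients screened per case found.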
  8. Future Cardiol. 2019 Dec 03.
    Suzuki R, Katada J, Ramagopalan S, McDonald L.
      Aim: Nonvalvular atrial fibrillation (NVAF) is associated with an increased risk of stroke; however, many patients are diagnosed after onset. This study assessed the potential of machine-learning algorithms to detect NVAF. Materials & methods: A retrospective database study was conducted using a Japanese claims database. Patients with and without NVAF were selected. A total of 41 variables were included in different classification algorithms. Results: Machine-learning algorithms identified NVAF with an area under the curve of >0.86; corresponding sensitivity/specificity was also high. The stacking model, which combined multiple algorithms, outperformed single-model approaches (area under the curve ≥0.90, sensitivity/specificity ≥0.80/0.82), although differences were small. Conclusion: Machine-learning-based algorithms can detect atrial fibrillation accurately. Although additional validation is needed, this methodology could encourage a new approach to detect NVAF.
    Keywords:  Japan; algorithm; atrial fibrillation; early detection; machine learning
    DOI:  https://doi.org/10.2217/fca-2019-0056
  9. Eur Radiol. 2019 Dec 06.
    Gong J, Liu J, Hao W, Nie S, Zheng B, Wang S, Peng W.
      OBJECTIVE: To develop a deep learning-based artificial intelligence (AI) scheme for predicting the likelihood of the ground-glass nodule (GGN) detected on CT images being invasive adenocarcinoma (IA) and also compare the accuracy of this AI scheme with that of two radiologists.
    METHODS: First, we retrospectively collected 828 histopathologically confirmed GGNs of 644 patients from two centers. Among them, 209 GGNs are confirmed IA and 619 are non-IA, including 409 adenocarcinomas in situ and 210 minimally invasive adenocarcinomas. Second, we applied a series of preprocessing techniques, such as image resampling, rescaling and cropping, and data augmentation, to process original CT images and generate new training and testing images. Third, we built an AI scheme based on a deep convolutional neural network by using a residual learning architecture and batch normalization technique. Finally, we conducted an observer study and compared the prediction performance of the AI scheme with that of two radiologists using an independent dataset with 102 GGNs.
    RESULTS: The new AI scheme yielded an area under the receiver operating characteristic curve (AUC) of 0.92 ± 0.03 in classifying between IA and non-IA GGNs, which is equivalent to the senior radiologist's performance (AUC 0.92 ± 0.03) and higher than the score of the junior radiologist (AUC 0.90 ± 0.03). The Kappa value of two sets of subjective prediction scores generated by two radiologists is 0.6.
    CONCLUSIONS: The study results demonstrate that an AI scheme can improve performance in predicting IA, which can help in developing a more effective personalized cancer treatment paradigm.
    KEY POINTS: • The feasibility of using a deep learning method to predict the likelihood of the ground-glass nodule being invasive adenocarcinoma. • Residual learning-based CNN model improves the performance in classifying between IA and non-IA nodules. • Artificial intelligence (AI) scheme yields higher performance than radiologists in predicting invasive adenocarcinoma.
    Keywords:  Carcinoma; Computer-assisted image interpretation; Lung neoplasms; Multiple pulmonary nodules; X-Ray computed tomography scanners
    DOI:  https://doi.org/10.1007/s00330-019-06533-w
  10. Diagnostics (Basel). 2019 Nov 29. pii: E207. [Epub ahead of print] 9(4)
    Li D, Mikela Vilmun B, Frederik Carlsen J, Albrecht-Beste E, Ammitzbøl Lauridsen C, Bachmann Nielsen M, Lindskov Hansen K.
      The aim of this study was to systematically review the performance of deep learning technology in detecting and classifying pulmonary nodules on computed tomography (CT) scans that were not from the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) database. Furthermore, we explored the difference in performance when the deep learning technology was applied to test datasets different from the training datasets. Only peer-reviewed, original research articles utilizing deep learning technology were included in this study, and only results from testing on datasets other than the LIDC-IDRI were included. We searched a total of six databases: EMBASE, PubMed, Cochrane Library, the Institute of Electrical and Electronics Engineers, Inc. (IEEE), Scopus, and Web of Science. This resulted in 1782 studies after duplicates were removed, and a total of 26 studies were included in this systematic review. Three studies explored the performance of pulmonary nodule detection only, 16 studies explored the performance of pulmonary nodule classification only, and 7 studies reported on both pulmonary nodule detection and classification. Three different deep learning architectures were mentioned amongst the included studies: convolutional neural network (CNN), massive training artificial neural network (MTANN), and deep stacked denoising autoencoder extreme learning machine (SDAE-ELM). The studies reached a classification accuracy of 68-99.6% and a detection accuracy of 80.6-94%. Performance of deep learning technology in studies using different test and training datasets was comparable to that in studies using the same type of test and training datasets. In conclusion, deep learning was able to achieve high levels of accuracy, sensitivity, and/or specificity in detecting and/or classifying nodules when applied to pulmonary CT scans not from the LIDC-IDRI database.
    Keywords:  artificial intelligence; deep learning; nodule classification; nodule detection
    DOI:  https://doi.org/10.3390/diagnostics9040207
  11. Breast. 2019 Nov 23. pii: S0960-9776(19)31101-4. [Epub ahead of print] 49: 115-122
    Lo Gullo R, Eskreis-Winkler S, Morris EA, Pinker K.
      In patients with locally advanced breast cancer undergoing neoadjuvant chemotherapy (NAC), some patients achieve a complete pathologic response (pCR), some achieve a partial response, and some do not respond at all or even progress. Accurate prediction of treatment response has the potential to improve patient care by improving prognostication, enabling de-escalation of toxic treatment that has little benefit, facilitating upfront use of novel targeted therapies, and avoiding delays to surgery. Visual inspection of a patient's tumor on multiparametric MRI is insufficient to predict that patient's response to NAC. However, machine learning and deep learning approaches using a mix of qualitative and quantitative MRI features have recently been applied to predict treatment response early in the course of or even before the start of NAC. This is a novel field but the data published so far has shown promising results. We provide an overview of the machine learning and deep learning models developed to date, as well as discuss some of the challenges to clinical implementation.
    Keywords:  Artificial intelligence; Machine learning; Multiparametric MRI; Neoadjuvant chemotherapy
    DOI:  https://doi.org/10.1016/j.breast.2019.11.009
  12. J Clin Gastroenterol. 2019 Nov 26.
    Becq A, Chandnani M, Bharadwaj S, Baran B, Ernest-Suarez K, Gabr M, Glissen-Brown J, Sawhney M, Pleskow DK, Berzin TM.
      BACKGROUND: Colonoscopy is the gold standard for polyp detection, but polyps may be missed. Artificial intelligence (AI) technologies may assist in polyp detection. To date, most studies for polyp detection have validated algorithms in ideal endoscopic conditions.
    AIM: To evaluate the performance of a deep-learning algorithm for polyp detection in a real-world setting of routine colonoscopy with variable bowel preparation quality.
    METHODS: We performed a prospective, single-center study of 50 consecutive patients referred for colonoscopy. Procedural videos were analyzed by a validated deep-learning AI polyp detection software that labeled suspected polyps. Videos were then re-read by 5 experienced endoscopists to categorize all possible polyps identified by the endoscopist and/or AI, and to measure the Boston Bowel Preparation Scale score.
    RESULTS: In total, 55 polyps were detected and removed by the endoscopist. The AI system identified 401 possible polyps. A total of 100 (24.9%) were categorized as "definite polyps"; 53/100 were identified and removed by the endoscopist. A total of 63 (15.6%) were categorized as "possible polyps" and were not removed by the endoscopist. In total, 238/401 were categorized as false positives. Two polyps identified by the endoscopist were missed by AI (false negatives). The sensitivity of AI for polyp detection was 98.8%, and the positive predictive value was 40.6%. The polyp detection rate for the endoscopist was 62% versus 82% for the AI system. Mean segmental Boston Bowel Preparation Scale scores were similar (2.64, 2.59, P=0.47) for true and false positives, respectively.
    CONCLUSIONS: A deep-learning algorithm can function effectively to detect polyps in a prospectively collected series of colonoscopies, even in the setting of variable preparation quality.
    DOI:  https://doi.org/10.1097/MCG.0000000000001272
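Note on entry 12: the sensitivity and positive predictive value reported in the abstract follow from the counts it gives, provided that both "definite" and "possible" polyps are counted as true positives. That grouping is an assumption made here for illustration, since the abstract does not state it explicitly. A minimal Python sketch with hypothetical variable names:

```python
# Counts quoted in the abstract of entry 12.
definite_polyps = 100    # AI detections categorized as "definite polyps"
possible_polyps = 63     # AI detections categorized as "possible polyps"
false_positives = 238    # AI detections categorized as false positives
false_negatives = 2      # endoscopist-identified polyps missed by the AI

# Assumption: definite and possible polyps both count as true positives.
true_positives = definite_polyps + possible_polyps


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of real polyps that the AI flagged."""
    return tp / (tp + fn)


def positive_predictive_value(tp: int, fp: int) -> float:
    """Fraction of AI detections that were real polyps."""
    return tp / (tp + fp)


sens = sensitivity(true_positives, false_negatives)
ppv = positive_predictive_value(true_positives, false_positives)

print(f"Total AI detections: {true_positives + false_positives}")  # 401
print(f"Sensitivity: {sens:.1%}")                                   # 98.8%
print(f"PPV: {ppv:.1%}")                                            # 40.6%
```

With this grouping, 163 true positives out of 165 real polyps gives the reported 98.8% sensitivity, and 163 out of 401 detections gives the reported 40.6% positive predictive value.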
  13. J Cancer Res Clin Oncol. 2019 Nov 30.
    Liu B, Chi W, Li X, Li P, Liang W, Liu H, Wang W, He J.
      PURPOSE: Lung cancer is the commonest cause of cancer deaths worldwide, and its mortality can be reduced significantly by early diagnosis and screening. Since the 1960s, driven by the pressing need to accurately and effectively interpret the massive volume of chest images generated daily, computer-assisted diagnosis of pulmonary nodules has opened up new opportunities to relax the limitations arising from physicians' subjectivity, experience and fatigue. Fair access to reliable and affordable computer-assisted diagnosis will help fight inequalities in incidence and mortality between populations. Significant advances have been achieved since the 1980s, and consistent efforts have been made to address the grand challenges of how to accurately detect pulmonary nodules with high sensitivity at a low false-positive rate and how to precisely differentiate between benign and malignant nodules. A comprehensive examination is lacking of how these techniques have developed, moving pulmonary nodule diagnosis from classical approaches to machine learning-assisted decision support. The main goal of this investigation is to provide a comprehensive state-of-the-art review of the computer-assisted nodule detection and benign-malignant classification techniques developed over three decades, which have evolved from the complicated ad hoc analysis pipeline of conventional approaches to simplified, seamlessly integrated deep learning techniques. This review also identifies challenges and highlights opportunities for future work in learning models, learning algorithms and enhancement schemes for bridging the current state to future prospects and satisfying future demand.
    CONCLUSION: It is the first literature review of the past 30 years' development in computer-assisted diagnosis of lung nodules. The challenges identified and the research opportunities highlighted in this survey are significant for bridging the current state to future prospects and satisfying future demand. The value of multifaceted driving forces and multidisciplinary research is acknowledged; these will bring the computer-assisted diagnosis of pulmonary nodules into the mainstream of clinical medicine, advance state-of-the-art clinical applications, and improve the welfare of both physicians and patients. We firmly hold the vision that fair access to reliable, faithful, and affordable computer-assisted diagnosis for early cancer diagnosis would fight inequalities in incidence and mortality between populations, and save more lives.
    Keywords:  Artificial intelligence; Computer-aided diagnosis; Deep learning; Lung cancer; Pulmonary nodules; Review
    DOI:  https://doi.org/10.1007/s00432-019-03098-5
  14. Spine (Phila Pa 1976). 2019 Dec 05.
    Maki S, Furuya T, Horikoshi T, Yokota H, Mori Y, Ota J, Kawasaki Y, Miyamoto T, Norimoto M, Okimatsu S, Shiga Y, Inage K, Orita S, Takahashi H, Suyari H, Uno T, Ohtori S.
      STUDY DESIGN: Retrospective analysis of magnetic resonance imaging (MRI).
    OBJECTIVE: To evaluate the performance of our convolutional neural network (CNN) in differentiating between spinal schwannoma and meningioma on MRI. We compared the performance of the CNN and that of two expert radiologists.
    SUMMARY OF BACKGROUND DATA: Preoperative discrimination between spinal schwannomas and meningiomas is crucial because different surgical procedures are required for their treatment. A deep-learning approach based on CNNs is gaining interest in the medical imaging field.
    METHODS: We retrospectively reviewed data from patients with spinal schwannoma and meningioma who had undergone MRI and tumor resection. There were 50 patients with schwannoma and 34 patients with meningioma. Sagittal T2-weighted magnetic resonance imaging (T2WI) and sagittal contrast-enhanced T1-weighted magnetic resonance imaging (T1WI) were used for the CNN training and validation. The deep learning framework TensorFlow was used to construct the CNN architecture. To evaluate the performance of the CNN, we plotted the receiver operating characteristic (ROC) curve and calculated the area under the curve (AUC). We calculated and compared the sensitivity, specificity, and accuracy of the diagnosis by the CNN and two board-certified radiologists.
    RESULTS: The AUCs of the ROC curves of the CNN based on T2WI and contrast-enhanced T1WI were 0.876 and 0.870, respectively. The sensitivity of the CNN based on T2WI was 78%; 100% for radiologist 1; and 95% for radiologist 2. The specificity was 82%, 26%, and 42%, respectively. The accuracy was 80%, 69%, and 73%, respectively. By contrast, the sensitivity of the CNN based on contrast-enhanced T1WI was 85%; 100% for radiologist 1; and 96% for radiologist 2. The specificity was 75%, 56%, and 58%, respectively. The accuracy was 81%, 82%, and 81%, respectively.
    CONCLUSIONS: We have successfully differentiated spinal schwannomas and meningiomas using the CNN with high diagnostic accuracy comparable to that of experienced radiologists.
    LEVEL OF EVIDENCE: 4.
    DOI:  https://doi.org/10.1097/BRS.0000000000003353
  15. JAMA Dermatol. 2019 Dec 04.
    Han SS, Moon IJ, Lim W, Suh IS, Lee SY, Na JI, Kim SH, Chang SE.
      Importance: Detection of cutaneous cancer on the face using deep-learning algorithms has been challenging because various anatomic structures create curves and shades that confuse the algorithm and can potentially lead to false-positive results.
    Objective: To evaluate whether an algorithm can automatically locate suspected areas and predict the probability of a lesion being malignant.
    Design, Setting, and Participants: Region-based convolutional neural network technology was used to create 924 538 possible lesions by extracting nodular benign lesions from 182 348 clinical photographs. After manually or automatically annotating these possible lesions based on image findings, convolutional neural networks were trained with 1 106 886 image crops to locate and diagnose cancer. Validation data sets (2844 images from 673 patients; mean [SD] age, 58.2 [19.9] years; 308 men [45.8%]; 185 patients with malignant tumors, 305 with benign tumors, and 183 free of tumor) were obtained from 3 hospitals between January 1, 2010, and September 30, 2018.
    Main Outcomes and Measures: The area under the receiver operating characteristic curve, F1 score (harmonic mean of precision and recall; range, 0.000-1.000), and Youden index score (sensitivity + specificity - 1; 0%-100%) were used to compare the performance of the algorithm with that of the participants.
    Results: The algorithm analyzed a mean (SD) of 4.2 (2.4) photographs per patient and reported the malignancy score according to the highest malignancy output. The area under the receiver operating characteristic curve for the validation data set (673 patients) was 0.910. At a high-sensitivity cutoff threshold, the sensitivity and specificity of the model with the 673 patients were 76.8% and 90.6%, respectively. With the test partition (325 images; 80 patients), the performance of the algorithm was compared with the performance of 13 board-certified dermatologists, 34 dermatology residents, 20 nondermatologic physicians, and 52 members of the general public with no medical background. When the disease screening performance was evaluated at high sensitivity areas using the F1 score and Youden index score, the algorithm showed a higher F1 score (0.831 vs 0.653 [0.126], P < .001) and Youden index score (0.675 vs 0.417 [0.124], P < .001) than that of nondermatologic physicians. The accuracy of the algorithm was comparable with that of dermatologists (F1 score, 0.831 vs 0.835 [0.040]; Youden index score, 0.675 vs 0.671 [0.100]).
    Conclusions and Relevance: The results of the study suggest that the algorithm could localize and diagnose skin cancer without preselection of suspicious lesions by dermatologists.
    DOI:  https://doi.org/10.1001/jamadermatol.2019.3807
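Note on entry 15: the two summary metrics named in the abstract can be written out in a few lines. The Python sketch below uses the conventional definitions: F1 as the harmonic mean of precision and recall, and the Youden index as sensitivity + specificity - 1. The function names are hypothetical. The Youden example uses the sensitivity and specificity reported for the 673-patient validation set; the F1 example uses made-up inputs, because the abstract does not report precision.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


# Sensitivity and specificity reported for the validation set (673 patients).
j = youden_index(0.768, 0.906)
print(f"Youden index: {j:.3f}")  # Youden index: 0.674

# Illustrative inputs only; these are NOT figures from the study.
example_f1 = f1_score(precision=0.80, recall=0.90)
print(f"Example F1: {example_f1:.3f}")  # Example F1: 0.847
```

The harmonic mean penalizes an imbalance between precision and recall more strongly than an arithmetic mean would, which is why F1 is often preferred for screening tasks with uneven class sizes.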
  16. Best Pract Res Clin Rheumatol. 2019 Aug; pii: S1521-6942(19)30098-1. [Epub ahead of print] 33(4): 101429
    Kataria S, Ravindran V.
      Digital health or eHealth technologies, notably pervasive computing, robotics, big data, wearable devices, machine learning, and artificial intelligence (AI), have opened unprecedented opportunities in how diseases are diagnosed and managed with active patient engagement. Patient-related data (real-world data) have provided insights into disease processes. Advanced analytics have refined these insights further into dynamic algorithms that, with the help of machine learning, aid clinicians in making more accurate diagnoses. AI is another tool which, although still evolving, has the potential to help identify early signs even before the clinical features are apparent. The evolving digital developments pose the challenge of allowing access to health-related data for further research while, at the same time, protecting each patient's privacy. This review focuses on the recent technological advances and their applications and highlights the immense potential to enable early diagnosis of rheumatological diseases.
    Keywords:  Artificial intelligence; Big data; Data analytics; Digital health; Machine learning; Robotics; Wearable devices
    DOI:  https://doi.org/10.1016/j.berh.2019.101429
  17. Curr Opin Neurol. 2019 Nov 26.
    Milea D, Singhal S, Najjar RP.
      PURPOSE OF REVIEW: The aim of this review is to highlight novel artificial intelligence-based methods for the detection of optic disc abnormalities, with particular focus on neurology and neuro-ophthalmology.
    RECENT FINDINGS: Methods for detection of optic disc abnormalities on retinal fundus images have evolved considerably over the last few years, from classical ophthalmoscopy to artificial intelligence-based identification methods being applied to retinal imaging with the aim of predicting sight and life-threatening complications of underlying brain or optic nerve conditions.
    SUMMARY: Artificial intelligence and in particular newly developed deep-learning systems are playing an increasingly important role for the detection and classification of acquired neuro-ophthalmic optic disc abnormalities on ocular fundus images. The implementation of automatic deep-learning methods for detection of abnormal optic discs, coupled with innovative hardware solutions for fundus imaging, could revolutionize the practice of neurologists and other non-ophthalmic healthcare providers.
    DOI:  https://doi.org/10.1097/WCO.0000000000000773
  18. Eur Radiol. 2019 Dec 06.
    Chassagnon G, Vakalopolou M, Paragios N, Revel MP.
      The relevance and penetration of machine learning in clinical practice is a recent phenomenon, with multiple applications currently under development. Deep learning, and especially convolutional neural networks (CNNs), is a subset of machine learning that has recently entered the field of thoracic imaging. The structure of neural networks, organized in multiple layers, allows them to address complex tasks. For several clinical situations, CNNs have demonstrated superior performance compared with classical machine learning algorithms and in some cases achieved comparable or better performance than clinical experts. Chest radiography, a high-volume procedure, is a natural application domain because of the large amount of stored images and reports facilitating the training of deep learning algorithms. Several algorithms for automated reporting have been developed. Training deep learning algorithms on CT images is more complex due to the dimension, variability, and complexity of the 3D signal. The role of these methods is likely to increase in clinical practice as a complement to the radiologist's expertise. The objective of this review is to provide definitions for understanding the methods and their potential applications for thoracic imaging.
    KEY POINTS: • Deep learning outperforms other machine learning techniques for a number of tasks in radiology. • The convolutional neural network is the most popular deep learning architecture in medical imaging. • Numerous deep learning algorithms are currently being developed; some of them may become part of clinical routine in the near future.
    Keywords:  Deep learning; Lung; Machine learning; Thorax
    DOI:  https://doi.org/10.1007/s00330-019-06564-3
  19. Clin Radiol. 2019 Nov 29. pii: S0009-9260(19)30640-3. [Epub ahead of print]
    Yu JS, Yu SM, Erdal BS, Demirer M, Gupta V, Bigelow M, Salvador A, Rink T, Lenobel SS, Prevedello LM, White RD.
      AIM: To investigate the feasibility of applying a deep convolutional neural network (CNN) for detection/localisation of acute proximal femoral fractures (APFFs) on hip radiographs.
    MATERIALS AND METHODS: This study had institutional review board approval. Radiographs of 307 patients with APFFs and 310 normal patients were identified. A split ratio of 3/1/1 was used to create training, validation, and test datasets. To test the validity of the proposed model, a 20-fold cross-validation was performed. The anonymised images from the test cohort were shown to two groups of radiologists: musculoskeletal radiologists and diagnostic radiology residents. Each reader was asked to assess if there was a fracture and localise it if one was detected. The area under the receiver operating characteristic curve (AUC), sensitivity, and specificity were calculated for the CNN and readers.
    RESULTS: The mean AUC was 0.9944 with a standard deviation of 0.0036. Mean sensitivity and specificity for fracture detection were 97.1% (81.5/84) and 96.7% (118/122), respectively. There was good concordance with saliency maps for lesion identification, but sensitivity was lower for characterising location (subcapital/transcervical, 84.1%; basicervical/intertrochanteric, 77%; subtrochanteric, 20%). Musculoskeletal radiologists showed a sensitivity and specificity for fracture detection of 100% and 100%, respectively, while residents showed 100% and 96.8%, respectively. For fracture localisation, the performance decreased slightly for human readers.
    CONCLUSION: The proposed CNN algorithm showed high accuracy for detection of APFFs, but the performance was lower for fracture localisation. Overall performance of the CNN was lower than that of radiologists, especially in localising the fracture.
    DOI:  https://doi.org/10.1016/j.crad.2019.10.022
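    The 3/1/1 split described in the abstract above can be illustrated with a short sketch. This is an assumption-laden illustration rather than the authors' procedure: the function name, the fixed seed, and the simple unstratified shuffle are all hypothetical, and the study may well have balanced fracture and normal cases across the partitions.

```python
import random


def split_indices(n_items, ratios=(3, 1, 1), seed=0):
    """Shuffle item indices and cut them into consecutive parts by ratio.

    Boundaries are computed from cumulative ratios, so the parts are
    disjoint and together cover every index exactly once.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility

    total = sum(ratios)
    parts, start, cumulative = [], 0, 0
    for ratio in ratios:
        cumulative += ratio
        end = n_items * cumulative // total
        parts.append(indices[start:end])
        start = end
    return parts


# 307 fracture patients + 310 normal patients = 617 patients in total.
train, validation, test = split_indices(307 + 310)
```

    With 617 patients, this particular rounding yields partitions of 370, 123, and 124 indices; the abstract does not state the exact partition sizes used in the study.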
  20. BMC Med Inform Decis Mak. 2019 Dec 02. 19(1): 248
    Ford E, Rooney P, Oliver S, Hoile R, Hurley P, Banerjee S, van Marwijk H, Cassell J.
      BACKGROUND: Identifying dementia early, using real-world data, is a public health challenge. As only two-thirds of people with dementia now ultimately receive a formal diagnosis in United Kingdom health systems and many receive it late in the disease process, there is ample room for improvement. The policy of the UK government and National Health Service (NHS) is to increase rates of timely dementia diagnosis. We used data from general practice (GP) patient records to create a machine-learning model to identify patients who have or who are developing dementia, but are currently undetected as having the condition by the GP.
    METHODS: We used electronic patient records from Clinical Practice Research Datalink (CPRD). Using a case-control design, we selected patients aged >65 years with a diagnosis of dementia (cases) and matched them 1:1 by sex and age to patients with no evidence of dementia (controls). We developed a list of 70 clinical entities related to the onset of dementia and recorded in the 5 years before diagnosis. After creating binary features, we trialled machine learning classifiers to discriminate between cases and controls (logistic regression, naïve Bayes, support vector machines, random forest and neural networks). We examined the most important features contributing to discrimination.
    RESULTS: The final analysis included data on 93,120 patients, with a median age of 82.6 years; 64.8% were female. The naïve Bayes model performed least well. The logistic regression, support vector machine, neural network and random forest performed very similarly with an AUROC of 0.74. The top features retained in the logistic regression model were disorientation and wandering, behaviour change, schizophrenia, self-neglect, and difficulty managing.
    CONCLUSIONS: Our model could aid GPs or health service planners with the early detection of dementia. Future work could improve the model by exploring the longitudinal nature of patient data and modelling decline in function over time.
    Keywords:  Dementia; Diagnosis; Early detection; Electronic health records; General practice; Machine learning; Prediction; Primary care
    DOI:  https://doi.org/10.1186/s12911-019-0991-9
  21. J Affect Disord. 2019 Nov 13. pii: S0165-0327(19)31141-3. [Epub ahead of print]
    Miché M, Studerus E, Meyer AH, Gloster AT, Beesdo-Baum K, Wittchen HU, Lieb R.
      BACKGROUND: The use of machine learning (ML) algorithms to study suicidality has recently been recommended. Our aim was to explore whether ML approaches have the potential to improve the prediction of suicide attempt (SA) risk. Using the epidemiological multiwave prospective-longitudinal Early Developmental Stages of Psychopathology (EDSP) data set, we compared four algorithms (logistic regression, lasso, ridge, and random forest) in predicting a future SA in a community sample of adolescents and young adults.
    METHODS: The EDSP Study prospectively assessed, over the course of 10 years, adolescents and young adults aged 14-24 years at baseline. Of 3021 subjects, 2797 were eligible for prospective analyses because they participated in at least one of the three follow-up assessments. Sixteen baseline predictors, all selected a priori from the literature, were used to predict follow-up SAs. Model performance was assessed using repeated nested 10-fold cross-validation. As the main measure of predictive performance we used the area under the curve (AUC).
    RESULTS: The mean AUCs of the four predictive models, logistic regression, lasso, ridge, and random forest, were 0.828, 0.826, 0.829, and 0.824, respectively.
    CONCLUSIONS: Based on our comparison, each algorithm performed equally well in distinguishing between a future SA case and a non-SA case in community adolescents and young adults. When choosing an algorithm, however, other considerations, such as ease of implementation, might in some instances lead to one algorithm being prioritized over another. Further research and replication studies are required in this regard.
    Keywords:  Adolescents and young adults; Community sample; Future suicide attempt; Machine learning; Prediction; Prospective design
    DOI:  https://doi.org/10.1016/j.jad.2019.11.093
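    The AUC used as the main performance measure in the abstract above can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that pairwise definition; it is a generic illustration with a hypothetical function name and made-up scores, not the study's analysis code, and it omits the nested cross-validation the authors describe.

```python
def auc_from_scores(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 outcomes; scores: predicted risk per subject.
    Each (positive, negative) pair contributes 1 if the positive scores
    higher, 0.5 on a tie, and 0 otherwise.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up outcomes and scores, for illustration only.
example_auc = auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

    The pairwise loop is quadratic in the number of subjects, which is fine for a sketch; rank-based formulas give the same value more efficiently on large samples.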
  22. Front Neurosci. 2019 ;13 1203
    Salvador R, Canales-Rodríguez E, Guerrero-Pedraza A, Sarró S, Tordesillas-Gutiérrez D, Maristany T, Crespo-Facorro B, McKenna P, Pomarol-Clotet E.
      Magnetic resonance imaging (MRI) has been proposed as a source of information for automatic prediction of individual diagnosis in schizophrenia. Optimal integration of data from different MRI modalities is an active area of research aimed at increasing diagnostic accuracy. Based on a sample of 96 patients with schizophrenia and a matched sample of 115 healthy controls that had undergone a single multimodal MRI session, we generated individual brain maps of gray matter VBM, 1back, and 2back levels of activation (nback fMRI), maps of amplitude of low-frequency fluctuations (resting-state fMRI), and maps of weighted global brain connectivity (resting-state fMRI). Four unimodal classifiers (Ridge, Lasso, Random Forests, and Gradient boosting) were applied to these maps to evaluate their classification accuracies. Based on the assignments made by the algorithms on test individuals, we quantified the amount of predictive information shared between maps (what we call redundancy analysis). Finally, we explored the added accuracy provided by a set of multimodal strategies that included post-classification integration based on probabilities, two-step sequential integration, and voxel-level multimodal integration through one-dimensional-convolutional neural networks (1D-CNNs). All four unimodal classifiers showed the highest test accuracies with the 2back maps (80% on average), achieving a maximum of 84% with the Lasso. Redundancy levels between brain maps were generally low (overall mean redundancy score of 0.14 in a 0-1 range), indicating that each brain map contained differential predictive information. The highest multimodal accuracy was delivered by the two-step Ridge classifier (87%), followed by the Ridge maximum and mean probability classifiers (both with 85% accuracy) and by the 1D-CNN, which achieved the same accuracy as the best unimodal classifier (84%). From these results, we conclude that, of all MRI modalities evaluated, task-based fMRI may be the best unimodal diagnostic option in schizophrenia. Low redundancy values point to ample potential for accuracy improvements through multimodal integration, with the two-step Ridge emerging as a suitable strategy.
    Keywords:  computer-aided diagnosis; convolutional neural network; lasso; machine learning; multimodal integration; ridge; schizophrenia
    DOI:  https://doi.org/10.3389/fnins.2019.01203