bims-arihec Biomed News
on Artificial Intelligence in Healthcare
Issue of 2020‒02‒02
twenty papers selected by
Céline Bélanger
Cogniges Inc.


  1. Curr Rev Musculoskelet Med. 2020 Jan 25.
    Helm JM, Swiergosz AM, Haeberle HS, Karnuta JM, Schaffer JL, Krebs VE, Spitzer AI, Ramkumar PN.
      PURPOSE OF REVIEW: With the unprecedented advancement of data aggregation and deep learning algorithms, artificial intelligence (AI) and machine learning (ML) are poised to transform the practice of medicine. The field of orthopedics, in particular, is uniquely suited to harness the power of big data, and in doing so provide critical insight into elevating the many facets of care provided by orthopedic surgeons. The purpose of this review is to critically evaluate the recent and novel literature regarding ML in the field of orthopedics and to address its potential impact on the future of musculoskeletal care. RECENT FINDINGS: Recent literature demonstrates that the incorporation of ML into orthopedics has the potential to elevate patient care through alternative patient-specific payment models, rapid analysis of imaging modalities, and remote patient monitoring. Just as the business of medicine was once considered outside the domain of the orthopedic surgeon, we report evidence that demonstrates these emerging applications of AI warrant ownership, leverage, and application by the orthopedic surgeon to better serve their patients and deliver optimal, value-based care.
    Keywords:  Artificial intelligence; Big data; Machine learning; Patient-specific payment models; Remote patient monitoring systems; Value-based care
    DOI:  https://doi.org/10.1007/s12178-020-09600-8
  2. Eur Radiol. 2020 Jan 30.
    Cui E, Li Z, Ma C, Li Q, Lei Y, Lan Y, Yu J, Zhou Z, Li R, Long W, Lin F.
      OBJECTIVE: To investigate externally validated magnetic resonance (MR)-based and computed tomography (CT)-based machine learning (ML) models for grading clear cell renal cell carcinoma (ccRCC). MATERIALS AND METHODS: Patients with pathologically proven ccRCC in 2009-2018 were retrospectively included for model development and internal validation; patients from another independent institution and The Cancer Imaging Archive dataset were included for external validation. Features were extracted from T1-weighted, T2-weighted, corticomedullary-phase (CMP), and nephrographic-phase (NP) MR as well as precontrast-phase (PCP), CMP, and NP CT. CatBoost was used for ML-model investigation. The reproducibility of texture features was assessed using intraclass correlation coefficient (ICC). Accuracy (ACC) was used for ML-model performance evaluation.
    RESULTS: Twenty external and 440 internal cases were included. Among 368 and 276 texture features from MR and CT, 322 and 250 features with good to excellent reproducibility (ICC ≥ 0.75) were included for ML-model development. The best MR- and CT-based ML models satisfactorily distinguished high- from low-grade ccRCCs in internal (MR-ACC = 73% and CT-ACC = 79%) and external (MR-ACC = 74% and CT-ACC = 69%) validation. Compared to single-sequence or single-phase images, the classifiers based on all-sequence MR (71% to 73% in internal and 64% to 74% in external validation) and all-phase CT (77% to 79% in internal and 61% to 69% in external validation) images had significant increases in ACC.
    CONCLUSIONS: MR- and CT-based ML models are valuable noninvasive techniques for discriminating high- from low-grade ccRCCs, and multiparameter MR- and multiphase CT-based classifiers are potentially superior to those based on single-sequence or single-phase imaging.
    KEY POINTS: • Both the MR- and CT-based machine learning models are reliable predictors for differentiating high- from low-grade ccRCCs. • ML models based on multiparameter MR sequences and multiphase CT images potentially outperform those based on single-sequence or single-phase images in ccRCC grading.
    Keywords:  Artificial intelligence; Clear cell renal cell carcinoma; Machine learning; Radiomics; Tumor grading
    DOI:  https://doi.org/10.1007/s00330-019-06601-1
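The reproducibility screen described in this abstract (keeping only texture features with ICC ≥ 0.75) can be illustrated with a minimal sketch. The feature names and ICC values below are hypothetical and chosen only for illustration; they are not data from the study, and the function name is an assumption rather than the authors' code.

```python
def select_reproducible_features(icc_by_feature, threshold=0.75):
    """Keep the names of features whose ICC meets the reproducibility threshold."""
    return [name for name, icc in icc_by_feature.items() if icc >= threshold]


# Hypothetical ICC values for illustration only (not data from the study).
example_icc = {
    "glcm_contrast": 0.91,
    "glcm_entropy": 0.62,
    "first_order_mean": 0.75,
    "glrlm_run_length": 0.48,
}

selected = select_reproducible_features(example_icc)
print(selected)
```

The threshold is inclusive, matching the "ICC ≥ 0.75" criterion quoted in the abstract, so a feature with an ICC of exactly 0.75 is retained.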
  3. HRB Open Res. 2019 ;2 13
    Storick V, O'Herlihy A, Abdelhafeez S, Ahmed R, May P.
      Introduction: Improving end-of-life (EOL) care is a priority worldwide as this population experiences poor outcomes and accounts disproportionately for costs. In clinical practice, physician judgement is the core method of identifying EOL care needs but has important limitations. Machine learning (ML) is a subset of artificial intelligence advancing capacity to identify patterns and make predictions using large datasets. ML approaches have the potential to improve clinical decision-making and policy design, but there has been no systematic assembly of current evidence. Methods: We conducted a rapid review, systematically searching seven databases from inception to December 31st, 2018: EMBASE, MEDLINE, Cochrane Library, PsycINFO, WOS, SCOPUS and ECONLIT. We included peer-reviewed studies that used ML approaches on routine data to improve palliative and EOL care for adults. Our specified outcomes were survival, quality of life (QoL), place of death, costs, and receipt of high-intensity treatment near end of life. We did not search grey literature and excluded material that was not a peer-reviewed article. Results: The database search identified 426 citations. We discarded 162 duplicates and screened 264 unique title/abstracts, of which 22 were forwarded for full text review. Three papers were included, 18 papers were excluded and one full text was sought but unobtainable. One paper predicted six-month mortality, one paper predicted 12-month mortality and one paper cross-referenced predicted 12-month mortality with healthcare spending. ML-informed models outperformed logistic regression in predicting mortality but poor prognosis is a weak driver of costs. Models using only routine administrative data had limited benefit from ML methods. Conclusion: While ML can in principle help to identify those at risk of adverse outcomes and inappropriate treatment near EOL, applications to policy and practice are formative. Future research must not only expand scope to other outcomes and longer timeframes, but also engage with individual preferences and ethical challenges.
    Keywords:  Machine learning; artificial intelligence; costs; decision-making; multimorbidity; palliative care; quality of life; terminal care
    DOI:  https://doi.org/10.12688/hrbopenres.12923.1
  4. JAMIA Open. 2018 Jul;1(1): 87-98
    Tang F, Xiao C, Wang F, Zhou J.
      Objective: The growing availability of rich clinical data such as patients' electronic health records provides great opportunities to address a broad range of real-world questions in medicine. At the same time, artificial intelligence and machine learning (ML)-based approaches have shown great promise in extracting insights from those data and helping with various clinical problems. The goal of this study is to conduct a systematic comparative study of different ML algorithms for several predictive modeling problems in urgent care. Design: We assess the performance of 4 benchmark prediction tasks (eg mortality and prediction, differential diagnostics, and disease marker discovery) using medical histories, physiological time-series, and demographics data from the Medical Information Mart for Intensive Care (MIMIC-III) database.
    Measurements: For each given task, performance was estimated using standard measures including the area under the receiver operating characteristic (AUC) curve, F-1 score, sensitivity, and specificity. Microaveraged AUC was used for multiclass classification models.
    Results and Discussion: Our results suggest that recurrent neural networks show the most promise in mortality prediction where temporal patterns in physiologic features alone can capture in-hospital mortality risk (AUC > 0.90). Temporal models did not provide additional benefit compared to deep models in differential diagnostics. When comparing the training-testing behaviors of readmission and mortality models, we illustrate that readmission risk may be independent of patient stability at discharge. We also introduce a multiclass prediction scheme for length of stay which preserves sensitivity and AUC with outliers of increasing duration despite decrease in sample size.
    Keywords:  machine learning; predictive modeling; urgent care
    DOI:  https://doi.org/10.1093/jamiaopen/ooy011
  5. Stroke. 2020 Jan 28. STROKEAHA119027611
    Lee H, Lee EJ, Ham S, Lee HB, Lee JS, Kwon SU, Kim JS, Kim N, Kang DW.
      Background and Purpose- We aimed to investigate the ability of machine learning (ML) techniques analyzing diffusion-weighted imaging (DWI) and fluid-attenuated inversion recovery (FLAIR) magnetic resonance imaging to identify patients within the recommended time window for thrombolysis. Methods- We analyzed DWI and FLAIR images of consecutive patients with acute ischemic stroke within 24 hours of clear symptom onset by applying automatic image processing approaches. These processes included infarct segmentation, DWI and FLAIR image registration, and image feature extraction. A total of 89 vector features from each image sequence were captured and used in the ML. Three ML models were developed to estimate stroke onset time for binary classification (≤4.5 hours): logistic regression, support vector machine, and random forest. To evaluate the performance of ML models, the sensitivity and specificity for identifying patients within 4.5 hours were compared with the sensitivity and specificity of human readings of DWI-FLAIR mismatch. Results- Data from a total of 355 patients were analyzed. DWI-FLAIR mismatch from human readings identified patients within 4.5 hours of symptom onset with 48.5% sensitivity and 91.3% specificity. ML algorithms had significantly greater sensitivities than human readers (75.8% for logistic regression, P=0.020; 72.7% for support vector machine, P=0.033; 75.8% for random forest, P=0.013) in detecting patients within 4.5 hours, but their specificities were comparable (82.6% for logistic regression, P=0.157; 82.6% for support vector machine, P=0.157; 82.6% for random forest, P=0.157). Conclusions- ML algorithms using multiple magnetic resonance imaging features were feasible and even more sensitive than human readings in identifying patients with stroke within the time window for acute thrombolysis.
    Keywords:  artificial intelligence; humans; machine learning; magnetic resonance imaging; stroke
    DOI:  https://doi.org/10.1161/STROKEAHA.119.027611
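As a reminder of how the two headline metrics in this abstract are defined, the following minimal sketch computes sensitivity and specificity for a binary "onset within 4.5 hours" label. The labels and predictions are hypothetical toy values, not data from the study, and the function name is an assumption made for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = within 4.5 h)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("Both classes must be present in y_true.")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy labels for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Sensitivity is the share of true within-window patients that are flagged, and specificity is the share of outside-window patients correctly excluded, which is why a classifier can gain one while losing some of the other.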
  6. J Am Coll Radiol. 2020 Jan 28. pii: S1546-1440(20)30003-X. [Epub ahead of print]
    Sorin V, Barash Y, Konen E, Klang E.
      PURPOSE: Natural language processing (NLP) enables conversion of free text into structured data. Recent innovations in deep learning technology provide improved NLP performance. We aimed to survey deep learning NLP fundamentals and review radiology-related research. METHODS: This systematic review was reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. We searched for deep learning NLP radiology studies published up to September 2019. MEDLINE, Scopus, and Google Scholar were used as search databases.
    RESULTS: Ten relevant studies published between 2017 and 2019 were identified. Deep learning models applied for NLP in radiology are convolutional neural networks, recurrent neural networks, long short-term memory networks, and attention networks. Deep learning NLP applications in radiology include flagging of diagnoses such as pulmonary embolisms and fractures, labeling follow-up recommendations, and automatic selection of imaging protocols. Deep learning NLP models perform as well as or better than traditional NLP models.
    CONCLUSION: Research and use of deep learning NLP in radiology is increasing. Acquaintance with this technology can help prepare radiologists for the coming changes in their field.
    Keywords:  Convolutional neural networks; deep learning; machine learning; natural language processing; radiology
    DOI:  https://doi.org/10.1016/j.jacr.2019.12.026
  7. Laryngoscope. 2020 Jan 28.
    Formeister EJ, Baum R, Knott PD, Seth R, Ha P, Ryan W, El-Sayed I, George J, Larson A, Plonowska K, Heaton C.
      OBJECTIVES/HYPOTHESIS: Machine learning (ML) is a type of artificial intelligence wherein a computer learns patterns and associations between variables to correctly predict outcomes. The objectives of this study were to 1) use a ML platform to identify factors important in predicting surgical complications in patients undergoing head and neck free tissue transfer, and 2) compare ML outputs to traditionally employed logistic regression models. STUDY DESIGN: Retrospective cohort study.
    METHODS: Using a dataset of 364 consecutive patients who underwent head and neck microvascular free tissue transfer at a single institution, 14 clinicopathologic characteristics were analyzed using a supervised ML algorithm of ensemble decision trees to predict surgical complications. The relative importance values of each variable in the ML analysis were then compared to logistic regression models.
    RESULTS: There were 166 surgical complications, which included bleeding or hematoma in 30 patients (8.2%), fistulae in 25 patients (6.9%), and infection or dehiscence in 52 patients (14.4%). There were 59 take-backs (16.2%), and six total (1.6%) and five partial (1.4%) flap failures. ML models were able to correctly classify outcomes with an accuracy of 65% to 75%. Factors that were identified in ML analyses as most important for predicting complications included institutional experience, flap ischemia time, age, and smoking pack-years. In contrast, the significant factors most frequently identified in traditional logistic regression analyses were patient age (P = .03), flap type (P = .03), and primary site of reconstruction (P = .06).
    CONCLUSIONS: In this single-institution dataset, ML algorithms identified factors for predicting complications after free tissue transfer that were distinct from traditional regression models.
    LEVEL OF EVIDENCE: 2c Laryngoscope, 2020.
    Keywords:  Machine learning; artificial intelligence; head and neck cancer; microvascular free flap
    DOI:  https://doi.org/10.1002/lary.28508
  8. Arch Bronconeumol. 2020 Jan 22. pii: S0300-2896(19)30594-0. [Epub ahead of print]
    Heili-Frades S, Minguez P, Mahillo Fernández I, Jiménez Hiscock L, Santos A, Heili Frades D, Carballosa de Miguel MDP, Fernández Ormaechea I, Álvarez Suárez L, Naya Prieto A, González Mangado N, Peces-Barba Romero G.
      INTRODUCTION: Mortality risk prediction for Intermediate Respiratory Care Unit (IRCU) patients can facilitate optimal treatment in high-risk patients. While Intensive Care Units (ICUs) have long-term experience in using algorithms for this purpose, the same strategies are not applicable to IRCUs because of their special features. The aim of this study is to develop an IRCU-specific mortality prediction tool using machine learning methods. METHODS: Vital signs were recorded from 1966 patients admitted from 2007 to 2017 to the Jiménez Díaz Foundation University Hospital's IRCU. A neural network was used to select the variables that best predict mortality status. Multivariate logistic regression provided cut-off points that best discriminated the mortality status for each of the parameters. A new guideline for risk assessment was applied and mortality was recorded during one year.
    RESULTS: Our algorithm shows that thrombocytopenia, metabolic acidosis, anemia, tachypnea, age, sodium levels, hypoxemia, leukocytopenia and hyperkalemia are the most relevant parameters associated with mortality. The first year of using this decision scheme showed a 50% decrease in the failure rate.
    CONCLUSIONS: We have generated a neural network model capable of identifying and classifying mortality predictors in the IRCU of a general hospital. Combined with multivariate regression analysis, it has provided us with a useful tool for the real-time monitoring of patients to detect specific mortality risks. The overall algorithm can be scaled to any type of unit offering personalized results and will increase in accuracy over time as more patients are included in the cohorts.
    Keywords:  Artificial intelligence; Intermediate Respiratory Care Unit; Machine learning; Mortality prediction model; Neural network
    DOI:  https://doi.org/10.1016/j.arbres.2019.11.019
  9. Br J Surg. 2020 Jan 30.
    Rahman SA, Walker RC, Lloyd MA, Grace BL, van Boxel GI, Kingma BF, Ruurda JP, van Hillegersberg R, Harris S, Parsons S, Mercer S, Griffiths EA, O'Neill JR, Turkington R, Fitzgerald RC, Underwood TJ.
      BACKGROUND: Early cancer recurrence after oesophagectomy is a common problem, with an incidence of 20-30 per cent despite the widespread use of neoadjuvant treatment. Quantification of this risk is difficult and existing models perform poorly. This study aimed to develop a predictive model for early recurrence after surgery for oesophageal adenocarcinoma using a large multinational cohort and machine learning approaches. METHODS: Consecutive patients who underwent oesophagectomy for adenocarcinoma and had neoadjuvant treatment in one Dutch and six UK oesophagogastric units were analysed. Using clinical characteristics and postoperative histopathology, models were generated using elastic net regression (ELR) and the machine learning methods random forest (RF) and extreme gradient boosting (XGB). Finally, a combined (ensemble) model of these was generated. The relative importance of factors to outcome was calculated as a percentage contribution to the model.
    RESULTS: A total of 812 patients were included. The recurrence rate at less than 1 year was 29·1 per cent. All of the models demonstrated good discrimination. Internally validated areas under the receiver operating characteristic (ROC) curve (AUCs) were similar, with the ensemble model performing best (AUC 0·791 for ELR, 0·801 for RF, 0·804 for XGB, 0·805 for ensemble). Performance was similar when internal-external validation was used (validation across sites, AUC 0·804 for ensemble). In the final model, the most important variables were number of positive lymph nodes (25·7 per cent) and lymphovascular invasion (16·9 per cent).
    CONCLUSION: The model derived using machine learning approaches and an international data set provided excellent performance in quantifying the risk of early recurrence after surgery, and will be useful in prognostication for clinicians and patients.
    DOI:  https://doi.org/10.1002/bjs.11461
  10. NPJ Digit Med. 2020 ;3 10
    Ghorbani A, Ouyang D, Abid A, He B, Chen JH, Harrington RA, Liang DH, Ashley EA, Zou JY.
      Echocardiography uses ultrasound technology to capture high temporal and spatial resolution images of the heart and surrounding structures, and is the most common imaging modality in cardiovascular medicine. Using convolutional neural networks on a large new dataset, we show that deep learning applied to echocardiography can identify local cardiac structures, estimate cardiac function, and predict systemic phenotypes that modify cardiovascular risk but are not readily identifiable to human interpretation. Our deep learning model, EchoNet, accurately identified the presence of pacemaker leads (AUC = 0.89), enlarged left atrium (AUC = 0.86), left ventricular hypertrophy (AUC = 0.75), left ventricular end systolic and diastolic volumes (R² = 0.74 and R² = 0.70), and ejection fraction (R² = 0.50), as well as predicted systemic phenotypes of age (R² = 0.46), sex (AUC = 0.88), weight (R² = 0.56), and height (R² = 0.33). Interpretation analysis validates that EchoNet shows appropriate attention to key cardiac structures when performing human-explainable tasks and highlights hypothesis-generating regions of interest when predicting systemic phenotypes difficult for human interpretation. Machine learning on echocardiography images can streamline repetitive tasks in the clinical workflow, provide preliminary interpretation in areas with insufficient qualified cardiologists, and predict phenotypes challenging for human evaluation.
    Keywords:  Cardiovascular diseases; Image processing; Machine learning
    DOI:  https://doi.org/10.1038/s41746-019-0216-8
  11. Sarcoma. 2020 ;2020 7163453
    Malinauskaite I, Hofmeister J, Burgermeister S, Neroladaki A, Hamard M, Montet X, Boudabbous S.
      Distinguishing lipoma from liposarcoma is challenging on conventional MRI examination. In case of uncertain diagnosis following MRI, a further invasive procedure (percutaneous biopsy or surgery) is often required to allow for diagnosis based on histopathological examination. Radiomics and machine learning allow for several types of pathologies encountered on radiological images to be automatically and reliably distinguished. The aim of the study was to assess the contribution of radiomics and machine learning in the differentiation between soft-tissue lipoma and liposarcoma on preoperative MRI and to assess the diagnostic accuracy of a machine-learning model compared to musculoskeletal radiologists. A total of 86 radiomics features were retrospectively extracted from volumes of interest on T1-weighted spin-echo 1.5 and 3.0 Tesla MRI of 38 soft-tissue tumors (24 lipomas and 14 liposarcomas, based on histopathological diagnosis). These radiomics features were then used to train a machine-learning classifier to distinguish lipoma and liposarcoma. The generalization performance of the machine-learning model was assessed using Monte-Carlo cross-validation and receiver operating characteristic curve analysis (ROC-AUC). Finally, the performance of the machine-learning model was compared to the accuracy of three specialized musculoskeletal radiologists using the McNemar test. The machine-learning classifier accurately distinguished lipoma and liposarcoma, with a ROC-AUC of 0.926. Notably, it performed better than the three specialized musculoskeletal radiologists reviewing the same patients, who achieved ROC-AUCs of 0.685, 0.805, and 0.785. Despite being developed on few cases, the trained machine-learning classifier accurately distinguishes lipoma and liposarcoma on preoperative MRI, with better performance than specialized musculoskeletal radiologists.
    DOI:  https://doi.org/10.1155/2020/7163453
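Monte-Carlo cross-validation, mentioned in this abstract, repeatedly draws a fresh random train/test split instead of partitioning the data into fixed folds. The sketch below shows only the splitting step. The sample count of 38 comes from the abstract, while the number of repeats, the test fraction, the seed, and the function name are assumptions made for illustration; the abstract does not state the authors' settings.

```python
import random


def monte_carlo_splits(n_samples, n_repeats, test_fraction, seed=0):
    """Yield (train_indices, test_indices) for repeated random hold-out splits."""
    rng = random.Random(seed)  # fixed seed so the splits are reproducible
    n_test = max(1, round(n_samples * test_fraction))
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # The first n_test shuffled indices form the test set, the rest train.
        yield sorted(indices[n_test:]), sorted(indices[:n_test])


# 38 tumors as in the abstract; 100 repeats and a 20% test share are assumed.
splits = list(monte_carlo_splits(n_samples=38, n_repeats=100, test_fraction=0.2))
train_idx, test_idx = splits[0]
print(len(train_idx), len(test_idx))
```

In a full evaluation, a classifier would be refit on each training subset and scored on the matching test subset, and the scores would then be averaged across repeats. Unlike k-fold cross-validation, a given case may appear in several test sets or in none.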
  12. Comput Biol Med. 2020 Jan;116 103569. pii: S0010-4825(19)30424-X. [Epub ahead of print]
    Hsu KC, Lin CH, Johnson KR, Liu CH, Chang TY, Huang KL, Fann YC, Lee TH.
      BACKGROUND AND PURPOSE: This study proposed a machine learning method for identifying ≥50% stenosis of the extracranial and intracranial arteries. PATIENTS AND METHODS: A total of 8211 patients with both carotid ultrasound and cerebral angiography were enrolled. Support vector machine (SVM) was employed as the machine learning classifier. Carotid Doppler parameters and transcranial Doppler parameters were used as the input features. Feature selection was performed using the Extra-Trees (extremely randomized trees) method.
    RESULTS: For the machine learning method, the sensitivities and specificities of identifying stenosis of the extracranial arteries were 88.5%-100% and 96.0%-100%, respectively. The sensitivities and specificities of identifying stenosis of the intracranial arteries were 71.7%-100% and 88.9%-100%, respectively.
    CONCLUSIONS: The SVM classifier with feature selection is an efficient method for identifying the stenosis of both intracranial and extracranial arteries. Compared with traditional Doppler criteria, this machine learning method achieves up to 20% higher accuracy and up to 45% higher sensitivity.
    Keywords:  Angiography; Carotid ultrasound; Intracranial artery stenosis; Machine learning
    DOI:  https://doi.org/10.1016/j.compbiomed.2019.103569
  13. JAMIA Open. 2018 Oct;1(2): 172-177
    Roth JA, Goebel N, Sakoparnig T, Neubauer S, Kuenzel-Pawlik E, Gerber M, Widmer AF, Abshagen C, Padiyath R, Hug BL.
      We describe a scalable platform for research-oriented analyses of routine data in hospitals, which evolved from a state-of-the-art business intelligence architecture for enterprise resource planning. This platform involves an in-memory database management system for data modeling and analytics and a high-performance cluster for more computing-intensive analytical tasks. Setting up platforms for research-oriented analyses is a highly dynamic, time-consuming, and costly process. In some health care institutions, effective research platforms may be derived from existing business intelligence systems.
    Keywords:  database management systems; health information systems; health services research; high performance analytic appliance (HANA); machine learning
    DOI:  https://doi.org/10.1093/jamiaopen/ooy039
  14. Colorectal Dis. 2020 Jan 28.
    Zhao B, Gabriel RA, Vaida F, Eisenstein S, Schnickel GT, Sicklick JK, Clary BM.
      AIM: Patients with synchronous colon cancer metastases have highly variable overall survival (OS), making accurate predictive models challenging to build. We aim to use machine learning to more accurately predict OS in these patients and to present this predictive model in the form of nomograms for patients and clinicians. METHODS: Using the National Cancer Database (2010-2014), we identified right colon (RC) and left colon (LC) cancer patients with synchronous metastases. Each primary site was split into training and testing datasets. Nomograms predicting 3-year overall survival were created for each site using Cox proportional hazard regression with lasso regression. Each model was evaluated by both calibration (comparison of predicted versus observed overall survival) and validation (degree of concordance as measured by c-index) methodologies.
    RESULTS: A total of 11,018 RC and 8,346 LC patients were used to construct and validate the nomograms. After stratifying each model into 5 risk groups, the predicted OS was within the 95% CI of the observed OS in 4 out of 5 risk groups for both the RC and LC models. Externally validated c-indexes at 3 years for RC and LC models were 0.794 and 0.761, respectively.
    CONCLUSIONS: Utilization of machine learning can result in more accurate predictive models for patients with metastatic colon cancer. Nomograms built from these models can assist clinicians and patients in the shared decision-making process of their cancer care.
    DOI:  https://doi.org/10.1111/codi.14991
  15. Eur Radiol. 2020 Jan 31.
    Wu G, Woodruff HC, Sanduleanu S, Refaee T, Jochems A, Leijenaar R, Gietema H, Shen J, Wang R, Xiong J, Bian J, Wu J, Lambin P.
      OBJECTIVES: Develop a CT-based radiomics model and combine it with frozen section (FS) and clinical data to distinguish invasive adenocarcinomas (IA) from preinvasive lesions/minimally invasive adenocarcinomas (PM). METHODS: This multicenter study cohort of 623 lung adenocarcinomas was split into training (n = 331), testing (n = 143), and external validation (n = 149) datasets. Random forest models were built using selected radiomics features, results from FS, lesion volume, clinical and semantic features, and combinations thereof. The area under the receiver operator characteristic curve (AUC) was used to evaluate model performance. The diagnostic accuracy, calibration, and decision curves of the models were tested.
    RESULTS: The radiomics-based model shows good predictive performance and diagnostic accuracy for distinguishing IA from PM, with AUCs of 0.89, 0.89, and 0.88, in the training, testing, and validation datasets, respectively, and with corresponding accuracies of 0.82, 0.79, and 0.85. Adding lesion volume and FS significantly increases the performance of the model with AUCs of 0.96, 0.97, and 0.96, and with accuracies of 0.91, 0.94, and 0.93 in the three datasets. There is no significant difference in AUC between the FS model enriched with radiomics and volume against an FS model enriched with volume alone, while the former has higher accuracy. The model combining all available information shows minor non-significant improvements in AUC and accuracy compared with an FS model enriched with radiomics and volume.
    CONCLUSIONS: Radiomics signatures are potential biomarkers for the risk of IA, especially in combination with FS, and could help guide surgical strategy for patients with pulmonary nodules.
    KEY POINTS: • A CT-based radiomics model may be a valuable tool for preoperative prediction of invasive adenocarcinoma for patients with pulmonary nodules. • Radiomics combined with frozen sections could help in guiding surgery strategy for patients with pulmonary nodules.
    Keywords:  Adenocarcinoma of lung; Carcinoma, non-small-cell lung; Frozen sections; Machine learning; Tomography, spiral computed
    DOI:  https://doi.org/10.1007/s00330-019-06597-8
  16. Biol Psychiatry Cogn Neurosci Neuroimaging. 2019 Nov 27. pii: S2451-9022(19)30304-0. [Epub ahead of print]
    Nielsen AN, Barch DM, Petersen SE, Schlaggar BL, Greene DJ.
      Psychiatric disorders are complex, involving heterogeneous symptomatology and neurobiology that rarely involves the disruption of single, isolated brain structures. In an attempt to better describe and understand the complexities of psychiatric disorders, investigators have increasingly applied multivariate pattern classification approaches to neuroimaging data and in particular supervised machine learning methods. However, supervised machine learning approaches also come with unique challenges and trade-offs, requiring additional study design and interpretation considerations. The goal of this review is to provide a set of best practices for evaluating machine learning applications to psychiatric disorders. We discuss how to evaluate two common efforts: 1) making predictions that have the potential to aid in diagnosis, prognosis, and treatment and 2) interrogating the complex neurophysiological mechanisms underlying psychopathology. We focus here on machine learning as applied to functional connectivity with magnetic resonance imaging, as an example to ground discussion. We argue that for machine learning classification to have translational utility for individual-level predictions, investigators must ensure that the classification is clinically informative, independent of confounding variables, and appropriately assessed for both performance and generalizability. We contend that shedding light on the complex mechanisms underlying psychiatric disorders will require consideration of the unique utility, interpretability, and reliability of the neuroimaging features (e.g., regions, networks, connections) identified from machine learning approaches. Finally, we discuss how the rise of large, multisite, publicly available datasets may contribute to the utility of machine learning approaches in psychiatry.
    Keywords:  Computational psychiatry; Feature selection; Functional connectivity; Machine learning; Neurophysiological mechanisms; Prediction
    DOI:  https://doi.org/10.1016/j.bpsc.2019.11.007
  17. Ocul Oncol Pathol. 2020 Jan;6(1): 58-65
    Kessel K, Mattila J, Linder N, Kivelä T, Lundin J.
      Objectives: The aim of this study was to train and validate deep learning algorithms to quantitate relative amyloid deposition (RAD; mean amyloid deposited area per stromal area) in corneal sections from patients with familial amyloidosis, Finnish (FAF), and assess its relationship with visual acuity. Methods: Corneal specimens were obtained from 42 patients undergoing penetrating keratoplasty, stained with Congo red, and digitally scanned. Areas of amyloid deposits and areas of stromal tissue were labeled on a pixel level for training and validation. The algorithms were used to quantify RAD in each cornea, and the association of RAD with visual acuity was assessed.
    Results: In the validation of the amyloid area classification, sensitivity was 86%, specificity 92%, and F-score 81. For corneal stromal area classification, sensitivity was 74%, specificity 82%, and F-score 73. There was insufficient evidence to demonstrate correlation (Spearman's rank correlation, -0.264, p = 0.091) between RAD and visual acuity (logMAR).
    Conclusions: Deep learning algorithms can achieve a high sensitivity and specificity in pixel-level classification of amyloid and corneal stromal area. Further modeling and development of algorithms to assess earlier stages of deposition from clinical images is necessary to better assess the correlation between amyloid deposition and visual acuity. The method might be applied to corneal dystrophies as well.
    Keywords:  Corneal amyloidosis; Familial amyloidosis, Finnish; Gelsolin; Machine learning; Meretoja syndrome
    DOI:  https://doi.org/10.1159/000500896
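    Note on entry 17: the abstract reports sensitivity, specificity, and an F-score for pixel-level classification. The sketch below shows how such metrics are commonly derived from confusion-matrix counts. It assumes the F-score is the F1 score (harmonic mean of precision and sensitivity), which the abstract does not state; the function name and the counts are hypothetical and are not taken from the study.

```python
def pixel_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, F1) from pixel-level confusion counts.

    Assumption: "F-score" is read as F1; the abstract does not say which
    variant the authors used.
    """
    sensitivity = tp / (tp + fn)   # share of true target pixels that were found
    specificity = tn / (tn + fp)   # share of background pixels correctly rejected
    precision = tp / (tp + fp)     # share of predicted target pixels that are correct
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, f1


# Hypothetical counts, chosen only to illustrate the arithmetic.
sens, spec, f1 = pixel_metrics(tp=86, fp=8, tn=92, fn=14)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} F1={f1:.2f}")
```

    Because F1 depends on precision, it shifts with the ratio of target to background pixels even when sensitivity and specificity stay fixed, so the three numbers are best read together.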
  18. PLoS One. 2020;15(1): e0228446
    Ellmann S, Wenkel E, Dietzel M, Bielowski C, Vesal S, Maier A, Hammon M, Janka R, Fasching PA, Beckmann MW, Schulz Wendtland R, Uder M, Bäuerle T.
      We investigated whether the integration of machine learning (ML) into MRI interpretation can provide accurate decision rules for the management of suspicious breast masses. A total of 173 consecutive patients with suspicious breast masses upon complementary assessment (BI-RADS IV/V: n = 100/76) received standardized breast MRI prior to histological verification. MRI findings were independently assessed by two observers (R1/R2: 5 years of experience/no experience in breast MRI) using six (semi-)quantitative imaging parameters. Interobserver variability was studied by ICC (intraclass correlation coefficient). A polynomial kernel function support vector machine was trained to differentiate between benign and malignant lesions based on the six imaging parameters and patient age. Ten-fold cross-validation was applied to prevent overfitting. Overall diagnostic accuracy and decision rules (rule-out criteria) to accurately exclude malignancy were evaluated. Results were integrated into a web application and published online. Malignant lesions were present in 107 patients (60.8%). Imaging features showed excellent interobserver agreement (ICC: 0.81-0.98) with variable diagnostic accuracy (AUC: 0.65-0.82). Overall performance of the ML algorithm was high (AUC = 90.1%; BI-RADS IV: AUC = 91.6%). The ML algorithm provided decision rules to accurately rule out malignancy with a false negative rate <1% in 31.3% of the BI-RADS IV cases. Thus, integration of ML into MRI interpretation can provide objective and accurate decision rules for the management of suspicious breast masses, and could help to reduce the number of potentially unnecessary biopsies.
    DOI:  https://doi.org/10.1371/journal.pone.0228446
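    Note on entry 18: the abstract mentions ten-fold cross-validation. As a minimal sketch of that idea (not the authors' pipeline), the code below partitions sample indices into ten folds so that each sample is held out exactly once. The function name `k_fold_indices`, the seed, and the use of the abstract's patient count as the sample size are assumptions for illustration; the study's actual fold assignment and any stratification are not described in the abstract.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]   # k disjoint, near-equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Illustrative use: 173 is the patient count quoted in the abstract.
for fold_number, (train, test) in enumerate(k_fold_indices(173, k=10), start=1):
    print(fold_number, len(train), len(test))
```

    A model would be fitted on `train` and scored on `test` in each round, and the ten scores averaged, so that no sample is ever used to evaluate a model that was trained on it.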
  19. JAMIA Open. 2019 Oct;2(3): 346-352
    Hong WS, Haimovich AD, Taylor RA.
      Objectives: To predict 72-h and 9-day emergency department (ED) return by using gradient boosting on an expansive set of clinical variables from the electronic health record.
    Methods: This retrospective study included all adult discharges from a level 1 trauma center ED and a community hospital ED covering the period of March 2013 to July 2017. A total of 1500 variables were extracted for each visit, and samples split randomly into training, validation, and test sets (80%, 10%, and 10%). Gradient boosting models were fit on 3 selections of the data: administrative data (demographics, prior hospital usage, and comorbidity categories), data available at triage, and the full set of data available at discharge. A logistic regression (LR) model built on administrative data was used for baseline comparison. Finally, the top 20 most informative variables identified from the full gradient boosting models were used to build a reduced model for each outcome.
    Results: A total of 330 631 discharges were available for analysis, with 29 058 discharges (8.8%) resulting in 72-h return and 52 748 discharges (16.0%) resulting in 9-day return to either ED. LR models using administrative data yielded test AUCs of 0.69 (95% confidence interval [CI] 0.68-0.70) and 0.71 (95% CI 0.70-0.72), while gradient boosting models using administrative data yielded test AUCs of 0.73 (95% CI 0.72-0.74) and 0.74 (95% CI 0.73-0.74) for 72-h and 9-day return, respectively. Gradient boosting models using variables available at triage yielded test AUCs of 0.75 (95% CI 0.74-0.76) and 0.75 (95% CI 0.74-0.75), while those using the full set of variables yielded test AUCs of 0.76 (95% CI 0.75-0.77) and 0.75 (95% CI 0.75-0.76). Reduced models using the top 20 variables yielded test AUCs of 0.73 (95% CI 0.71-0.74) and 0.73 (95% CI 0.72-0.74).
    Discussion and Conclusion: Gradient boosting models leveraging clinical data are superior to LR models built on administrative data at predicting 72-h and 9-day returns.
    Keywords:  decision support techniques; emergency medicine; machine learning
    DOI:  https://doi.org/10.1093/jamiaopen/ooz019
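    Note on entry 19: the models in this abstract are compared by test AUC. As a reminder of what that number measures, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, counting ties as one half. The function name `auc` and the toy data are assumptions for illustration; the pairwise loop is quadratic and is meant for clarity, not for datasets of the size reported here.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

    Read this way, an AUC of 0.76 means that in about three out of four such pairs the model ranks the returning patient above the non-returning one; 0.5 corresponds to random ranking.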
  20. Eur Heart J Cardiovasc Imaging. 2020 Jan 30. pii: jeaa001. [Epub ahead of print]
    Swift AJ, Lu H, Uthoff J, Garg P, Cogliano M, Taylor J, Metherall P, Zhou S, Johns CS, Alabed S, Condliffe RA, Lawrie A, Wild JM, Kiely DG.
      AIMS: Pulmonary arterial hypertension (PAH) is a progressive condition with high mortality. Quantitative cardiovascular magnetic resonance (CMR) imaging metrics in PAH target individual cardiac structures and have diagnostic and prognostic utility but are challenging to acquire. The primary aim of this study was to develop and test a tensor-based machine learning approach to holistically identify diagnostic features in PAH using CMR, and secondarily, visualize and interpret key discriminative features associated with PAH.
    METHODS AND RESULTS: Consecutive treatment naive patients with PAH or no evidence of pulmonary hypertension (PH), undergoing CMR and right heart catheterization within 48 h, were identified from the ASPIRE registry. A tensor-based machine learning approach, multilinear subspace learning, was developed and the diagnostic accuracy of this approach was compared with standard CMR measurements. Two hundred and twenty patients were identified: 150 with PAH and 70 with no PH. The diagnostic accuracy of the approach was high as assessed by area under the curve at receiver operating characteristic analysis (P < 0.001): 0.92 for PAH, slightly higher than standard CMR metrics. Moreover, establishing the diagnosis using the approach was less time-consuming, being achieved within 10 s. Learnt features were visualized in feature maps with correspondence to cardiac phases, confirming known and also identifying potentially new diagnostic features in PAH.
    CONCLUSION: A tensor-based machine learning approach has been developed and applied to CMR. High diagnostic accuracy has been shown for PAH diagnosis and new learnt features were visualized with diagnostic potential.
    Keywords:  diagnosis; machine learning; pulmonary arterial hypertension; right ventricle; tensor
    DOI:  https://doi.org/10.1093/ehjci/jeaa001