bims-arihec 2020-03-08 papers

bims-arihec

Biomed News

on Artificial intelligence in healthcare

Issue of 2020–03–08
nineteen papers selected by
Céline Bélanger, Cogniges Inc.

The Economic Impact of Artificial Intelligence in Health Care: Systematic Review.
Artificial Intelligence in Global Health -A Framework and Strategy for Adoption and Sustainability.
The impact of machine learning on patient care: A systematic review.
Predicting Intensive Care Unit admission among patients presenting to the emergency department using machine learning and natural language processing.
Opportunities for machine learning to improve surgical ward safety.
Systematic Review of Privacy-Preserving Distributed Machine Learning From Federated Databases in Health Care.
Evaluation of Combined Artificial Intelligence and Radiologist Assessment to Interpret Screening Mammograms.
Impact of a deep learning assistant on the histopathologic classification of liver cancer.
ACCURACY OF ARTIFICIAL INTELLIGENCE ON HISTOLOGY PREDICTION AND DETECTION OF COLORECTAL POLYPS: A SYSTEMATIC REVIEW AND META-ANALYSIS.
Machine Learning in Prediction of Second Primary Cancer and Recurrence in Colorectal Cancer.
Streamlining multi-omic and artificial intelligence analysis through interrogative biology and baicis for translational precision medicine applications in clinical oncology.
Implementation of artificial intelligence (AI) for lung cancer clinical trial matching in a tertiary cancer center.
Concordance Between Watson for Oncology and a Multidisciplinary Clinical Decision-Making Team for Gastric Cancer and the Prognostic Implications: Retrospective Study.
Improved Accuracy in Optical Diagnosis of Colorectal Polyps Using Convolutional Neural Networks with Visual Explanations.
Artificial intelligence and convolution neural networks assessing mammographic images: a narrative literature review.
Artificial Intelligence in Acute Kidney Injury Risk Prediction.
Comparison of Artificial Intelligence-Based Fully Automatic Chest CT Emphysema Quantification to Pulmonary Function Testing.
Identifying Schizophrenia Using Structural MRI With a Deep Learning Algorithm.
Detection of extremity chronic traumatic osteomyelitis by machine learning based on computed-tomography images: A retrospective study.

J Med Internet Res. 2020 Feb 20. 22(2): e16866

The Economic Impact of Artificial Intelligence in Health Care: Systematic Review.

Justus Wolff, Josch Pauling, Andreas Keck, Jan Baumbach.

   BACKGROUND: Positive economic impact is a key decision factor in making the case for or against investing in an artificial intelligence (AI) solution in the health care industry. It is most relevant for the care provider and insurer as well as for the pharmaceutical and medical technology sectors. Although the broad economic impact of digital health solutions in general has been assessed many times in literature and the benefit for patients and society has also been analyzed, the specific economic impact of AI in health care has been addressed only sporadically.
OBJECTIVE: This study aimed to systematically review and summarize the cost-effectiveness studies dedicated to AI in health care and to assess whether they meet the established quality criteria.
METHODS: In a first step, the quality criteria for economic impact studies were defined based on the established and adapted criteria schemes for cost impact assessments. In a second step, a systematic literature review based on qualitative and quantitative inclusion and exclusion criteria was conducted to identify relevant publications for an in-depth analysis of the economic impact assessment. In a final step, the quality of the identified economic impact studies was evaluated based on the defined quality criteria for cost-effectiveness studies.
RESULTS: Very few publications have thoroughly addressed the economic impact assessment, and the economic assessment quality of the reviewed publications on AI shows severe methodological deficits. Only 6 out of 66 publications could be included in the second step of the analysis based on the inclusion criteria. Out of these 6 studies, none comprised a methodologically complete cost impact analysis. There are two areas for improvement in future studies. First, the initial investment and operational costs for the AI infrastructure and service need to be included. Second, alternatives to achieve similar impact must be evaluated to provide a comprehensive comparison.
CONCLUSIONS: This systematic literature analysis proved that the existing impact assessments show methodological deficits and that upcoming evaluations require more comprehensive economic analyses to enable economic decisions for or against implementing AI technology in health care.

Keywords:  artificial intelligence; cost-benefit analysis; machine learning; telemedicine

DOI:  https://doi.org/10.2196/16866
Int J MCH AIDS. 2020 ;9(1): 121-127

Artificial Intelligence in Global Health -A Framework and Strategy for Adoption and Sustainability.

Trevor D Hadley, Rowland W Pettit, Tahir Malik, Amelia A Khoei, Hamisu M Salihu.

  Artificial Intelligence (AI) applications in medicine have grown considerably in recent years. AI in the forms of Machine Learning, Natural Language Processing, Expert Systems, Planning and Logistics methods, and Image Processing networks provide great analytical aptitude. While AI methods were first conceptualized for radiology, investigations today are established across all medical specialties. The necessity for proper infrastructure, skilled labor, and access to large, well-organized data sets has kept the majority of medical AI applications in higher-income countries. However, critical technological improvements, such as cloud computing and the near-ubiquity of smartphones, have paved the way for use of medical AI applications in resource-poor areas. Global health initiatives (GHI) have already begun to explore ways to leverage medical AI technologies to detect and mitigate public health inequities. For example, AI tools can help optimize vaccine delivery and community healthcare worker routes, thus enabling limited resources to have a maximal impact. Other promising AI tools have demonstrated an ability to: predict burn healing time from smartphone photos; track regions of socioeconomic disparity combined with environmental trends to predict communicable disease outbreaks; and accurately predict pregnancy complications such as birth asphyxia in low resource settings with limited patient clinical data. In this commentary, we discuss the current state of AI-driven GHI and explore relevant lessons from past technology-centered GHI. Additionally, we propose a conceptual framework to guide the development of sustainable strategies for AI-driven GHI, and we outline areas for future research.

Keywords:  AI Framework; AI Strategy; Artificial Intelligence; Global Health; Implementation; Sustainability

DOI:  https://doi.org/10.21106/ijma.296
Artif Intell Med. 2020 Mar;pii: S0933-3657(19)30395-1. [Epub ahead of print]103 101785

The impact of machine learning on patient care: A systematic review.

David Ben-Israel, W Bradley Jacobs, Steve Casha, Stefan Lang, Won Hyung A Ryu, Madeleine de Lotbiniere-Bassett, David W Cadotte.

   BACKGROUND: Despite the expanding use of machine learning (ML) in fields such as finance and marketing, its application in the daily practice of clinical medicine is almost non-existent. In this systematic review, we describe the various areas within clinical medicine that have applied the use of ML to improve patient care.
METHODS: A systematic review was performed in accordance with the PRISMA guidelines using Medline(R), EBM Reviews, Embase, Psych Info, and Cochrane Databases, focusing on human studies that used ML to directly address a clinical problem. Included studies were published from January 1, 2000 to May 1, 2018 and provided metrics on the performance of the utilized ML tool.
RESULTS: A total of 1909 unique publications were reviewed, with 378 retrospective articles and 8 prospective articles meeting inclusion criteria. Retrospective publications were found to be increasing in frequency, with 61 % of articles published within the last 4 years. Prospective articles comprised only 2 % of the articles meeting our inclusion criteria. These studies utilized a prospective cohort design with an average sample size of 531.
CONCLUSION: The majority of literature describing the use of ML in clinical medicine is retrospective in nature and often outlines proof-of-concept approaches to impact patient care. We postulate that identifying and overcoming key translational barriers, including real-time access to clinical data, data security, physician approval of "black box" generated results, and performance evaluation will allow for a fundamental shift in medical practice, where specialized tools will aid the healthcare team in providing better patient care.

Keywords:  Artificial intelligence; Clinical practice; Machine learning; Patient care; Systematic review

DOI:  https://doi.org/10.1016/j.artmed.2019.101785
PLoS One. 2020 ;15(3): e0229331

Predicting Intensive Care Unit admission among patients presenting to the emergency department using machine learning and natural language processing.

Marta Fernandes, Rúben Mendes, Susana M Vieira, Francisca Leite, Carlos Palos, Alistair Johnson, Stan Finkelstein, Steven Horng, Leo Anthony Celi.

The risk stratification of patients in the emergency department begins at triage. It is vital to stratify patients early based on their severity, since undertriage can lead to increased morbidity, mortality and costs. Our aim was to present a new approach to assist healthcare professionals at triage in the stratification of patients and in identifying those with higher risk of ICU admission. Adult patients assigned Manchester Triage System (MTS) or Emergency Severity Index (ESI) 1 to 3 from a Portuguese and a United States Emergency Departments were analyzed. Variables routinely collected at triage were used and natural language processing was applied to the patient chief complaint. Stratified random sampling was applied to split the data in train (70%) and test (30%) sets and 10-fold cross validation was performed for model training. Logistic regression, random forests, and a random undersampling boosting algorithm were used. We compared the performance obtained with the reference model-using only triage priorities-with the models using additional variables. For both hospitals, a logistic regression model achieved higher overall performance, yielding areas under the receiver operating characteristic and precision-recall curves of 0.91 (95% CI 0.90-0.92) and 0.30 (95% CI 0.27-0.33) for the United States hospital and of 0.85 (95% CI 0.83-0.86) and 0.06 (95% CI 0.05-0.07) for the Portuguese hospital. Heart rate, pulse oximetry, respiratory rate and systolic blood pressure were the most important predictors of ICU admission. Compared to the reference models, the models using clinical variables and the chief complaint presented higher recall for patients assigned MTS/ESI 3 and can identify patients assigned MTS/ESI 3 who are at risk for ICU admission.

DOI: https://doi.org/10.1371/journal.pone.0229331
Am J Surg. 2020 Feb 26. pii: S0002-9610(20)30091-X. [Epub ahead of print]

Opportunities for machine learning to improve surgical ward safety.

Tyler J Loftus, Patrick J Tighe, Amanda C Filiberto, Jeremy Balch, Gilbert R Upchurch, Parisa Rashidi, Azra Bihorac.

   BACKGROUND: Delayed recognition of decompensation and failure-to-rescue on surgical wards are major sources of preventable harm. This review assimilates and critically evaluates available evidence and identifies opportunities to improve surgical ward safety.
DATA SOURCES: Fifty-eight articles from Cochrane Library, EMBASE, and PubMed databases were included.
CONCLUSIONS: Only 15-20% of patients suffering ward arrest survive. In most cases, subtle signs of instability often occur prior to critical illness and arrest, and underlying pathology is reversible. Coarse risk assessments lead to under-triage of high-risk patients to wards, where surveillance for complications depends on time-consuming manual review of health records, infrequent patient assessments, prediction models that lack accuracy and autonomy, and biased, error-prone decision-making. Streaming electronic heath record data, wearable continuous monitors, and recent advances in deep learning and reinforcement learning can promote efficient and accurate risk assessments, earlier recognition of instability, and better decisions regarding diagnosis and treatment of reversible underlying pathology.

Keywords:  Cardiac arrest; Decompensation; Deterioration; Machine learning; Surgery; Ward

DOI:  https://doi.org/10.1016/j.amjsurg.2020.02.037
JCO Clin Cancer Inform. 2020 Mar;4 184-200

Systematic Review of Privacy-Preserving Distributed Machine Learning From Federated Databases in Health Care.

Fadila Zerka, Samir Barakat, Sean Walsh, Marta Bogowicz, Ralph T H Leijenaar, Arthur Jochems, Benjamin Miraglio, David Townend, Philippe Lambin.

Big data for health care is one of the potential solutions to deal with the numerous challenges of health care, such as rising cost, aging population, precision medicine, universal health coverage, and the increase of noncommunicable diseases. However, data centralization for big data raises privacy and regulatory concerns.Covered topics include (1) an introduction to privacy of patient data and distributed learning as a potential solution to preserving these data, a description of the legal context for patient data research, and a definition of machine/deep learning concepts; (2) a presentation of the adopted review protocol; (3) a presentation of the search results; and (4) a discussion of the findings, limitations of the review, and future perspectives.Distributed learning from federated databases makes data centralization unnecessary. Distributed algorithms iteratively analyze separate databases, essentially sharing research questions and answers between databases instead of sharing the data. In other words, one can learn from separate and isolated datasets without patient data ever leaving the individual clinical institutes.Distributed learning promises great potential to facilitate big data for medical application, in particular for international consortiums. Our purpose is to review the major implementations of distributed learning in health care.

DOI: https://doi.org/10.1200/CCI.19.00047
JAMA Netw Open. 2020 Mar 02. 3(3): e200265

Evaluation of Combined Artificial Intelligence and Radiologist Assessment to Interpret Screening Mammograms.

and the DM DREAM Consortium

Importance: Mammography screening currently relies on subjective human interpretation. Artificial intelligence (AI) advances could be used to increase mammography screening accuracy by reducing missed cancers and false positives.
Objective: To evaluate whether AI can overcome human mammography interpretation limitations with a rigorous, unbiased evaluation of machine learning algorithms.
Design, Setting, and Participants: In this diagnostic accuracy study conducted between September 2016 and November 2017, an international, crowdsourced challenge was hosted to foster AI algorithm development focused on interpreting screening mammography. More than 1100 participants comprising 126 teams from 44 countries participated. Analysis began November 18, 2016.
Main Outcomes and Measurements: Algorithms used images alone (challenge 1) or combined images, previous examinations (if available), and clinical and demographic risk factor data (challenge 2) and output a score that translated to cancer yes/no within 12 months. Algorithm accuracy for breast cancer detection was evaluated using area under the curve and algorithm specificity compared with radiologists' specificity with radiologists' sensitivity set at 85.9% (United States) and 83.9% (Sweden). An ensemble method aggregating top-performing AI algorithms and radiologists' recall assessment was developed and evaluated.
Results: Overall, 144 231 screening mammograms from 85 580 US women (952 cancer positive ≤12 months from screening) were used for algorithm training and validation. A second independent validation cohort included 166 578 examinations from 68 008 Swedish women (780 cancer positive). The top-performing algorithm achieved an area under the curve of 0.858 (United States) and 0.903 (Sweden) and 66.2% (United States) and 81.2% (Sweden) specificity at the radiologists' sensitivity, lower than community-practice radiologists' specificity of 90.5% (United States) and 98.5% (Sweden). Combining top-performing algorithms and US radiologist assessments resulted in a higher area under the curve of 0.942 and achieved a significantly improved specificity (92.0%) at the same sensitivity.
Conclusions and Relevance: While no single AI algorithm outperformed radiologists, an ensemble of AI algorithms combined with radiologist assessment in a single-reader screening environment improved overall accuracy. This study underscores the potential of using machine learning methods for enhancing mammography screening interpretation.

DOI: https://doi.org/10.1001/jamanetworkopen.2020.0265
NPJ Digit Med. 2020 ;3 23

Impact of a deep learning assistant on the histopathologic classification of liver cancer.

Amirhossein Kiani, Bora Uyumazturk, Pranav Rajpurkar, Alex Wang, Rebecca Gao, Erik Jones, Yifan Yu, Curtis P Langlotz, Robyn L Ball, Thomas J Montine, Brock A Martin, Gerald J Berry, Michael G Ozawa, Florette K Hazard, Ryanne A Brown, Simon B Chen, Mona Wood, Libby S Allard, Lourdes Ylagan, Andrew Y Ng, Jeanne Shen.

  Artificial intelligence (AI) algorithms continue to rival human performance on a variety of clinical tasks, while their actual impact on human diagnosticians, when incorporated into clinical workflows, remains relatively unexplored. In this study, we developed a deep learning-based assistant to help pathologists differentiate between two subtypes of primary liver cancer, hepatocellular carcinoma and cholangiocarcinoma, on hematoxylin and eosin-stained whole-slide images (WSI), and evaluated its effect on the diagnostic performance of 11 pathologists with varying levels of expertise. Our model achieved accuracies of 0.885 on a validation set of 26 WSI, and 0.842 on an independent test set of 80 WSI. Although use of the assistant did not change the mean accuracy of the 11 pathologists (p = 0.184, OR = 1.281), it significantly improved the accuracy (p = 0.045, OR = 1.499) of a subset of nine pathologists who fell within well-defined experience levels (GI subspecialists, non-GI subspecialists, and trainees). In the assisted state, model accuracy significantly impacted the diagnostic decisions of all 11 pathologists. As expected, when the model's prediction was correct, assistance significantly improved accuracy (p = 0.000, OR = 4.289), whereas when the model's prediction was incorrect, assistance significantly decreased accuracy (p = 0.000, OR = 0.253), with both effects holding across all pathologist experience levels and case difficulty levels. Our results highlight the challenges of translating AI models into the clinical setting, and emphasize the importance of taking into account potential unintended negative consequences of model assistance when designing and testing medical AI-assistance tools.

Keywords:  Liver cancer; Machine learning; Pathology

DOI:  https://doi.org/10.1038/s41746-020-0232-8
Gastrointest Endosc. 2020 Feb 28. pii: S0016-5107(20)30209-1. [Epub ahead of print]

ACCURACY OF ARTIFICIAL INTELLIGENCE ON HISTOLOGY PREDICTION AND DETECTION OF COLORECTAL POLYPS: A SYSTEMATIC REVIEW AND META-ANALYSIS.

Thomas Kl Lui, Chuan-Guo Guo, Wai K Leung.

BACKGROUND AND AIMS: We perform a meta-analysis of all published studies to determine the diagnostic accuracy of AI on histology prediction and detection of colorectal polyps.
METHOD: We searched Embase, PubMed, Medline, Web of Science and Cochrane library databases to identify studies using AI for colorectal polyp histology prediction and detection. The quality of the included studies was measured by the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. We used a bivariate meta-analysis following a random effects model to summarize the data and plotted hierarchical summary receiver-operating characteristic (HSROC) curves. The area under the HSROC curve (AUC) served as an indicator of the diagnostic accuracy and during head-to-head comparison.
RESULT: A total of 7,680 images of colorectal polyps from 18 studies were included in the analysis of histology prediction. The accuracy of the AI (AUC) was 0.96 (95% CI, 0.95-0.98), with corresponding pooled sensitivity of 92.3% (95% CI, 88.8%-94.9%) and specificity of 89.8% (95% CI, 85.3%-93.0%). The AUC of AI using narrow-band imaging (NBI) was significantly higher than non-NBI (0.98 vs 0.84, p<0.01). The performance of AI was superior to nonexpert endoscopists (0.97 vs 0.90, p<0.01). For characterization of diminutive polyps using deep learning model with non-magnifying NBI, the pooled negative predictive value was 95.1% (95% CI, 87.7%-98.1%). For polyp detection, the pooled AUC was 0.90 (95% CI, 0.67-1.00) with sensitivity of 95.0% (95% CI, 91.0%-97.0%) and specificity of 88.0% (95% CI, 58.0%-99.0%).
CONCLUSION: AI was accurate in histology prediction and detection of colorectal polyps, including diminutive polyps. The performance of AI was better under NBI and was superior to non-expert endoscopists. Despite the difference in AI models and study designs, the AI performances are rather consistent, which could serve a reference for future AI studies.

DOI: https://doi.org/10.1016/j.gie.2020.02.033
Int J Med Sci. 2020 ;17(3): 280-291

Machine Learning in Prediction of Second Primary Cancer and Recurrence in Colorectal Cancer.

Wen-Chien Ting, Yen-Chiao Angel Lu, Wei-Chi Ho, Chalong Cheewakriangkrai, Horng-Rong Chang, Chia-Ling Lin.

   BACKGROUND: Colorectal cancer (CRC) is the third commonly diagnosed cancer worldwide. Recurrence of CRC (Re) and onset of a second primary malignancy (SPM) are important indicators in treating CRC, but it is often difficult to predict the onset of a SPM. Therefore, we used mechanical learning to identify risk factors that affect Re and SPM.
PATIENT AND METHODS: CRC patients with cancer registry database at three medical centers were identified. All patients were classified based on Re or no recurrence (NRe) as well as SPM or no SPM (NSPM). Two classifiers, namely A Library for Support Vector Machines (LIBSVM) and Reduced Error Pruning Tree (REPTree), were applied to analyze the relationship between clinical features and Re and/or SPM category by constructing optimized models.
RESULTS: When Re and SPM were evaluated separately, the accuracy of LIBSVM was 0.878 and that of REPTree was 0.622. When Re and SPM were evaluated in combination, the precision of models for SPM+Re, NSPM+Re, SPM+NRe, and NSPM+NRe was 0.878, 0.662, 0.774, and 0.778, respectively.
CONCLUSIONS: Machine learning can be used to rank factors affecting tumor Re and SPM. In clinical practice, routine checkups are necessary to ensure early detection of new tumors. The success of prediction and early detection may be enhanced in the future by applying "big data" analysis methods such as machine learning.

Keywords:  colorectal cancer; machine learning; second primary malignancy

DOI:  https://doi.org/10.7150/ijms.37134
Ann Oncol. 2018 Oct;pii: S0923-7534(19)50324-6. [Epub ahead of print]29 Suppl 8 viii667

Streamlining multi-omic and artificial intelligence analysis through interrogative biology and baicis for translational precision medicine applications in clinical oncology.

L Rodrigues, M A Kiebish, G Miller, L Zhang, V Vemulapalli, V K Vishnudas, R Sarangarajan, N R Narain, V R Akmaev.

DOI: https://doi.org/10.1093/annonc/mdy303.056
Ann Oncol. 2019 Apr;pii: S0923-7534(19)30153-X. [Epub ahead of print]30 Suppl 2 ii74

Implementation of artificial intelligence (AI) for lung cancer clinical trial matching in a tertiary cancer center.

K Leventakos, J Helgeson, A S Mansfield, E Deering, A Schwecke, A Adjei, J Molina, C Hocum, T Halfdanarson, R Marks, K Parikh, K Pomerleau, S Coverdill, M Rammage, T Haddad.

DOI: https://doi.org/10.1093/annonc/mdz065
J Med Internet Res. 2020 Feb 20. 22(2): e14122

Concordance Between Watson for Oncology and a Multidisciplinary Clinical Decision-Making Team for Gastric Cancer and the Prognostic Implications: Retrospective Study.

Yulong Tian, Xiaodong Liu, Zixuan Wang, Shougen Cao, Zimin Liu, Qinglian Ji, Zequn Li, Yuqi Sun, Xin Zhou, Daosheng Wang, Yanbing Zhou.

   BACKGROUND: With the increasing number of cancer treatments, the emergence of multidisciplinary teams (MDTs) provides patients with personalized treatment options. In recent years, artificial intelligence (AI) has developed rapidly in the medical field. There has been a gradual tendency to replace traditional diagnosis and treatment with AI. IBM Watson for Oncology (WFO) has been proven to be useful for decision-making in breast cancer and lung cancer, but to date, research on gastric cancer is limited.
OBJECTIVE: This study compared the concordance of WFO with MDT and investigated the impact on patient prognosis.
METHODS: This study retrospectively analyzed eligible patients (N=235) with gastric cancer who were evaluated by an MDT, received corresponding recommended treatment, and underwent follow-up. Thereafter, physicians inputted the information of all patients into WFO manually, and the results were compared with the treatment programs recommended by the MDT. If the MDT treatment program was classified as "recommended" or "considered" by WFO, we considered the results concordant. All patients were divided into a concordant group and a nonconcordant group according to whether the WFO and MDT treatment programs were concordant. The prognoses of the two groups were analyzed.
RESULTS: The overall concordance of WFO and the MDT was 54.5% (128/235) in this study. The subgroup analysis found that concordance was less likely in patients with human epidermal growth factor receptor 2 (HER2)-positive tumors than in patients with HER2-negative tumors (P=.02). Age, Eastern Cooperative Oncology Group performance status, differentiation type, and clinical stage were not found to affect concordance. Among all patients, the survival time was significantly better in concordant patients than in nonconcordant patients (P<.001). Multivariate analysis revealed that concordance was an independent prognostic factor of overall survival in patients with gastric cancer (hazard ratio 0.312 [95% CI 0.187-0.521]).
CONCLUSIONS: The treatment recommendations made by WFO and the MDT were mostly concordant in gastric cancer patients. If the WFO options are updated to include local treatment programs, the concordance will greatly improve. The HER2 status of patients with gastric cancer had a strong effect on the likelihood of concordance. Generally, survival was better in concordant patients than in nonconcordant patients.

Keywords:  Watson for Oncology; artificial intelligence; concordance; gastric cancer; multidisciplinary team

DOI:  https://doi.org/10.2196/14122
Gastroenterology. 2020 Feb 28. pii: S0016-5085(20)30263-8. [Epub ahead of print]

Improved Accuracy in Optical Diagnosis of Colorectal Polyps Using Convolutional Neural Networks with Visual Explanations.

Eun Hyo Jin, Dongheon Lee, Jung Ho Bae, Hae Yeon Kang, Min-Sun Kwak, Ji Yeon Seo, Jong In Yang, Sun Young Yang, Seon Hee Lim, Jeong Yoon Yim, Joo Hyun Lim, Goh Eun Chung, Su Jin Chung, Ji Min Choi, Yoo Min Han, Seung Joo Kang, Jooyoung Lee, Hee Chan Kim, Joo Sung Kim.

   BACKGROUND & AIMS: Narrow-band imaging (NBI) can be used to determine whether colorectal polyps are adenomatous or hyperplastic. We investigated whether an artificial intelligence (AI) system can increase the accuracy of characterizations of polyps by endoscopists of different skill levels.
METHODS: We developed convolutional neural networks (CNNs) for evaluation of diminutive colorectal polyps, based on efficient neural architecture searches via parameter sharing with augmentation using narrow-band images of diminutive (≤5 mm) polyps, collected from October 2015 through October 2017 at the Seoul National University Hospital, Healthcare System Gangnam Center (training set). We trained the CNN using images from 1100 adenomatous polyps and 1050 hyperplastic polyps from 1379 patients. We then tested the system using 300 images of 180 adenomatous polyps and 120 hyperplastic polyps, obtained from January 2018 to May 2019. We compared the accuracy of 22 endoscopists of different skill levels (7 novices, 4 experts, and 11 NBI-trained experts) vs the CNN in evaluation of images (adenomatous vs hyperplastic) from 180 adenomatous and 120 hyperplastic polyps. The endoscopists then evaluated the polyp images with knowledge of the CNN-processed results. We conducted mixed-effect logistic and linear regression analyses to determine the effects of AI assistance on the accuracy of analysis of diminutive colorectal polyps by endoscopists (primary outcome).
RESULTS: The CNN distinguished adenomatous vs hyperplastic diminutive polyps with 86.7% accuracy, based on histologic analysis as the reference standard. Endoscopists distinguished adenomatous vs hyperplastic diminutive polyps with 82.5% overall accuracy (novices, 73.8% accuracy; experts, 83.8% accuracy; and NBI-trained experts, 87.6% accuracy). With knowledge of the CNN-processed results, the overall accuracy of the endoscopists increased to 88.5% (P<.05). With knowledge of the CNN-processed results, the accuracy of novice endoscopists increased to 85.6% (P<.05). The CNN-processed results significantly reduced endoscopist time of diagnosis (from 3.92 to 3.37 seconds per polyp, P=.042).
CONCLUSIONS: We developed a CNN that significantly increases the accuracy of evaluation of diminutive colorectal polyps (as adenomatous vs hyperplastic) and reduces the time of diagnosis by endoscopists. This AI assistance system significantly increased the accuracy of analysis by novice endoscopists, who achieved near-expert levels of accuracy without extra training. The CNN assistance system can reduce the skill-level dependence of endoscopists and costs.

Keywords:  cancer screening; colorectal cancer; deep learning; diagnostic

DOI:  https://doi.org/10.1053/j.gastro.2020.02.036
J Med Radiat Sci. 2020 Mar 05.

Artificial intelligence and convolution neural networks assessing mammographic images: a narrative literature review.

Dennis Jay Wong, Ziba Gandomkar, Wan-Jing Wu, Guijing Zhang, Wushuang Gao, Xiaoying He, Yunuo Wang, Warren Reed.

  Studies have shown that the use of artificial intelligence can reduce errors in medical image assessment. The diagnosis of breast cancer is an essential task; however, diagnosis can include 'detection' and 'interpretation' errors. Studies to reduce these errors have shown the feasibility of using convolution neural networks (CNNs). This narrative review presents recent studies in diagnosing mammographic malignancy investigating the accuracy and reliability of these CNNs. Databases including ScienceDirect, PubMed, MEDLINE, British Medical Journal and Medscape were searched using the terms 'convolutional neural network or artificial intelligence', 'breast neoplasms [MeSH] or breast cancer or breast carcinoma' and 'mammography [MeSH Terms]'. Articles collected were screened under the inclusion and exclusion criteria, accounting for the publication date and exclusive use of mammography images, and included only literature in English. After extracting data, results were compared and discussed. This review included 33 studies and identified four recurring categories of studies: the differentiation of benign and malignant masses, the localisation of masses, cancer-containing and cancer-free breast tissue differentiation and breast classification based on breast density. CNN's application in detecting malignancy in mammography appears promising but requires further standardised investigations before potentially becoming an integral part of the diagnostic routine in mammography.

Keywords:  Artificial intelligence; breast cancer; breast density; convolutional neural network; mammography

DOI:  https://doi.org/10.1002/jmrs.385
J Clin Med. 2020 Mar 03. pii: E678. [Epub ahead of print]9(3):

Artificial Intelligence in Acute Kidney Injury Risk Prediction.

Joana Gameiro, Tiago Branco, José António Lopes.

  Acute kidney injury (AKI) is a frequent complication in hospitalized patients, which is associated with worse short and long-term outcomes. It is crucial to develop methods to identify patients at risk for AKI and to diagnose subclinical AKI in order to improve patient outcomes. The advances in clinical informatics and the increasing availability of electronic medical records have allowed for the development of artificial intelligence predictive models of risk estimation in AKI. In this review, we discussed the progress of AKI risk prediction from risk scores to electronic alerts to machine learning methods.

Keywords:  acute kidney injury; artificial intelligence; risk prediction

DOI:  https://doi.org/10.3390/jcm9030678
AJR Am J Roentgenol. 2020 Mar 04. 1-7

Comparison of Artificial Intelligence-Based Fully Automatic Chest CT Emphysema Quantification to Pulmonary Function Testing.

Andreas M Fischer, Akos Varga-Szemes, Marly van Assen, L Parkwood Griffith, Pooyan Sahbaee, Jonathan I Sperl, John W Nance, U Joseph Schoepf.

  OBJECTIVE. The purpose of this study was to evaluate an artificial intelligence (AI)-based prototype algorithm for fully automated quantification of emphysema on chest CT compared with pulmonary function testing (spirometry). MATERIALS AND METHODS. A total of 141 patients (72 women, mean age ± SD of 66.46 ± 9.7 years [range, 23-86 years]; 69 men, mean age of 66.72 ± 11.4 years [range, 27-91 years]) who underwent both chest CT acquisition and spirometry within 6 months were retrospectively included. The spirometry-based Tiffeneau index (TI; calculated as the ratio of forced expiratory volume in the first second to forced vital capacity) was used to measure emphysema severity; a value less than 0.7 was considered to indicate airway obstruction. Segmentation of the lung based on two different reconstruction methods was carried out by using a deep convolution image-to-image network. This multilayer convolutional neural network was combined with multilevel feature chaining and depth monitoring. To discriminate the output of the network from ground truth, an adversarial network was used during training. Emphysema was quantified using spatial filtering and attenuation-based thresholds. Emphysema quantification and TI were compared using the Spearman correlation coefficient. RESULTS. The mean TI for all patients was 0.57 ± 0.13. The mean percentages of emphysema using reconstruction methods 1 and 2 were 9.96% ± 11.87% and 8.04% ± 10.32%, respectively. AI-based emphysema quantification showed very strong correlation with TI (reconstruction method 1, ρ = -0.86; reconstruction method 2, ρ = -0.85; both p < 0.0001), indicating that AI-based emphysema quantification meaningfully reflects clinical pulmonary physiology. CONCLUSION. AI-based, fully automated emphysema quantification shows good correlation with TI, potentially contributing to an image-based diagnosis and quantification of emphysema severity.

Keywords:  CT; artificial intelligence; chronic obstructive pulmonary disease; emphysema quantification; lung function values

DOI:  https://doi.org/10.2214/AJR.19.21572
Front Psychiatry. 2020 ;11 16

Identifying Schizophrenia Using Structural MRI With a Deep Learning Algorithm.

Jihoon Oh, Baek-Lok Oh, Kyong-Uk Lee, Jeong-Ho Chae, Kyongsik Yun.

   Objective: Although distinctive structural abnormalities occur in patients with schizophrenia, detecting schizophrenia with magnetic resonance imaging (MRI) remains challenging. This study aimed to detect schizophrenia in structural MRI data sets using a trained deep learning algorithm.
Method: Five public MRI data sets (BrainGluSchi, COBRE, MCICShare, NMorphCH, and NUSDAST) from schizophrenia patients and normal subjects, for a total of 873 structural MRI data sets, were used to train a deep convolutional neural network.
Results: The deep learning algorithm trained with structural MR images detected schizophrenia in randomly selected images with reliable performance (area under the receiver operating characteristic curve [AUC] of 0.96). The algorithm could also identify MR images from schizophrenia patients in a previously unencountered data set with an AUC of 0.71 to 0.90. The deep learning algorithm's classification performance degraded to an AUC of 0.71 when a new data set with younger patients and a shorter duration of illness than the training data sets was presented. The brain region contributing the most to the performance of the algorithm was the right temporal area, followed by the right parietal area. Semitrained clinical specialists hardly discriminated schizophrenia patients from healthy controls (AUC: 0.61) in the set of 100 randomly selected brain images.
Conclusions: The deep learning algorithm showed good performance in detecting schizophrenia and identified relevant structural features from structural brain MRI data; it had an acceptable classification performance in a separate group of patients at an earlier stage of the disease. Deep learning can be used to delineate the structural characteristics of schizophrenia and to provide supplementary diagnostic information in clinical settings.

Keywords:  MRI; classification; deep learning; schizophrenia; structural abnormalities

DOI:  https://doi.org/10.3389/fpsyt.2020.00016
Medicine (Baltimore). 2020 Feb;99(9): e19239

Detection of extremity chronic traumatic osteomyelitis by machine learning based on computed-tomography images: A retrospective study.

Yifan Wu, Xin Lu, Jianqiao Hong, Weijie Lin, Shiming Chen, Shenghong Mou, Gang Feng, Ruijian Yan, Zhiyuan Cheng.

Despite the availability of a series of tests, detection of chronic traumatic osteomyelitis is still exhausting in clinical practice. We hypothesized that machine learning based on computed-tomography (CT) images would provide better diagnostic performance for extremity traumatic chronic osteomyelitis than the serological biomarker alone. A retrospective study was carried out to collect medical data from patients with extremity traumatic osteomyelitis according to the criteria of musculoskeletal infection society. In each patient, serum levels of C-reactive protein (CRP), erythrocyte sedimentation rate (ESR), and D-dimer were measured and CT scan of the extremity was conducted 7 days after admission preoperatively. A deep residual network (ResNet) machine learning model was established for recognition of bone lesion on the CT image. A total of 28,718 CT images from 163 adult patients were included. Then, we randomly extracted 80% of all CT images from each patient for training, 10% for validation, and 10% for testing. Our results showed that machine learning (83.4%) outperformed CRP (53.2%), ESR (68.8%), and D-dimer (68.1%) separately in accuracy. Meanwhile, machine learning (88.0%) demonstrated highest sensitivity when compared with CRP (50.6%), ESR (73.0%), and D-dimer (51.7%). Considering the specificity, machine learning (77.0%) is better than CRP (59.4%) and ESR (62.2%), but not D-dimer (83.8%). Our findings indicated that machine learning based on CT images is an effective and promising avenue for detection of chronic traumatic osteomyelitis in the extremity.

DOI: https://doi.org/10.1097/MD.0000000000019239