bims-arihec Biomed News
on Artificial Intelligence in Healthcare
Issue of 2019‒11‒24
sixteen papers selected by
Céline Bélanger
Cogniges Inc.


  1. J Crit Care. 2019 Oct 22. pii: S0883-9441(19)30153-4. [Epub ahead of print] 55: 163-170
    Kim J, Chang H, Kim D, Jang DH, Park I, Kim K.
      BACKGROUND: We hypothesized that utilizing machine learning (ML) algorithms for screening septic shock in the ED would provide better accuracy than qSOFA or MEWS.
    METHODS: The study population was adult (≥20 years) patients visiting the ED for suspected infection. The target event was septic shock within 24 h after arrival. Demographics, vital signs, level of consciousness, chief complaints (CC) and initial blood test results were used as predictors. CC were embedded into a 16-dimensional vector space using singular value decomposition. Six base learners (support vector machine, gradient-boosting machine, random forest, multivariate adaptive regression splines, least absolute shrinkage and selection operator, and ridge regression) and their ensembles were tested. We also trained and tested MLP networks with various settings.
    RESULTS: A total of 49,560 patients were included and 4817 (9.7%) had septic shock within 24 h. All ML classifiers significantly outperformed qSOFA score, MEWS and their age-sex adjusted versions with their AUROC ranging from 0.883 to 0.929. The ensembles of the base classifiers showed the best performance and addition of CC embedding was associated with statistically significant increases in performance.
    CONCLUSIONS: ML classifiers significantly outperform clinical scores in screening septic shock at ED triage.
    Keywords:  Clinical decision support tool; Diagnosis; Emergency department triage tool; Machine learning; Prediction; Sepsis; Septic shock
    DOI:  https://doi.org/10.1016/j.jcrc.2019.09.024
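      The AUROC values reported in the entry above can be read as the probability that a randomly chosen patient with the event receives a higher score than a randomly chosen patient without it. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `auroc` and the toy scores and labels are hypothetical.

```python
from itertools import product


def auroc(scores, labels):
    """Return the AUROC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p, n in product(positives, negatives):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = septic shock within 24 h, 0 = no event.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auc = auroc(toy_scores, toy_labels)
print(f"AUROC = {toy_auc:.2f}")  # 3 of 4 pairs are ordered correctly -> 0.75
```

      The pairwise loop is quadratic in the number of cases, which is fine for illustration; on a cohort the size of the one above, a rank-based implementation would be the more practical choice.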
  2. Gastrointest Endosc. 2019 Nov 16. pii: S0016-5107(19)32428-9. [Epub ahead of print]
    Klang E, Barash Y, Yehuda Margalit R, Soffer S, Shimon O, Albshesh A, Ben-Horin S, Michal Amitai M, Eliakim R, Kopylov U.
      BACKGROUND AND AIM: The aim of our study was to develop and evaluate a deep learning algorithm for the automated detection of small-bowel ulcers in Crohn's disease (CD) on capsule endoscopy (CE) images of individual patients.
    METHODS: We retrospectively collected CE images of known CD patients and controls. Each image was labeled by an expert gastroenterologist as either normal mucosa or containing mucosal ulcers. A convolutional neural network (CNN) was trained to classify images into either normal mucosa or mucosal ulcers. First, we trained the network on 5-fold randomly split images (each fold with 80% training images and 20% testing images). Then we conducted 10 experiments in which images from n-1 patients were used to train a network and images from a different individual patient were used to test the network. Results of the networks were compared for randomly split images and for individual patients. Areas under the curve (AUCs) and accuracies were computed for each individual network.
    RESULTS: Overall, our dataset included 17,640 CE images from 49 patients; 7,391 images with mucosal ulcers and 10,249 images of normal mucosa. For randomly split images results were excellent with AUCs of 0.99 and accuracies ranging from 95.4% to 96.7%. For individual patient-level experiments, the AUCs were also excellent (0.94 to 0.99).
    CONCLUSIONS: Deep learning technology provides accurate and fast automated detection of mucosal ulcers on CE images. Individual patient-level analysis provided high and consistent diagnostic accuracy with shortened reading time; in the future deep learning algorithms may augment and facilitate CE reading.
    Keywords:  AI (Artificial Intelligence); Capsule Endoscopy; Crohn's Disease; Neural Networks; Ulcer
    DOI:  https://doi.org/10.1016/j.gie.2019.11.012
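      The difference between the randomly split experiments and the patient-level experiments in the entry above comes down to how images are assigned to training and test sets. The sketch below illustrates a leave-one-patient-out split in plain Python; it is an assumption-laden illustration rather than the authors' pipeline, and the function name and toy patient identifiers are hypothetical. It iterates over every patient, whereas the abstract describes 10 such experiments.

```python
def leave_one_patient_out(patient_ids):
    """Yield (held_out_patient, train_indices, test_indices) for each patient.

    patient_ids[i] is the patient that image i came from.
    """
    for held_out in sorted(set(patient_ids)):
        train = [i for i, pid in enumerate(patient_ids) if pid != held_out]
        test = [i for i, pid in enumerate(patient_ids) if pid == held_out]
        yield held_out, train, test


# Hypothetical toy data: six images from three patients.
image_patients = ["p1", "p1", "p2", "p2", "p2", "p3"]
folds = list(leave_one_patient_out(image_patients))
for held_out, train, test in folds:
    train_patients = {image_patients[i] for i in train}
    test_patients = {image_patients[i] for i in test}
    # No patient contributes images to both sides of the split.
    assert train_patients.isdisjoint(test_patients)
    print(held_out, "train:", train, "test:", test)
```

      A random image-level split can place near-identical frames from the same patient in both sets, which is one reason patient-level results are usually the more conservative estimate.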
  3. Breast. 2019 Nov 06. pii: S0960-9776(19)30592-2. [Epub ahead of print] 49: 74-80
    Tagliafico AS, Piana M, Schenone D, Lai R, Massone AM, Houssami N.
      Diagnosis of early invasive breast cancer relies on radiology and clinical evaluation, supplemented by biopsy confirmation. At least three issues burden this approach: a) suboptimal sensitivity and suboptimal positive predictive power of radiology screening and diagnostic approaches, respectively; b) invasiveness of biopsy with discomfort for women undergoing diagnostic tests; c) long turnaround time for recall tests. In the screening setting, radiology sensitivity is suboptimal, and when a suspicious lesion is detected and a biopsy is recommended, the positive predictive value of radiology is modest. Recent technological advances in medical imaging, especially in the field of artificial intelligence applied to image analysis, hold promise in addressing clinical challenges in cancer detection, assessment of treatment response, and monitoring disease progression. Radiomics include feature extraction from clinical images; these features are related to tumor size, shape, intensity, and texture, collectively providing comprehensive tumor characterization, the so-called radiomics signature of the tumor. Radiomics is based on the hypothesis that extracted quantitative data derives from mechanisms occurring at genetic and molecular levels. In this article we focus on the role and potential of radiomics in breast cancer diagnosis and prognostication.
    Keywords:  Artificial intelligence; Breast cancer; Digital breast tomosynthesis; Magnetic resonance imaging; Prediction; Radiomics
    DOI:  https://doi.org/10.1016/j.breast.2019.10.018
  4. Radiology. 2019 Nov 19. 190372
    Zhou LQ, Wu XL, Huang SY, Wu GG, Ye HR, Wei Q, Bao LY, Deng YB, Li XR, Cui XW, Dietrich CF.
      Background Deep learning (DL) algorithms are gaining extensive attention for their excellent performance in image recognition tasks. DL models can automatically make a quantitative assessment of complex medical image characteristics and achieve increased accuracy in diagnosis with higher efficiency. Purpose To determine the feasibility of using a DL approach to predict clinically negative axillary lymph node metastasis from US images in patients with primary breast cancer. Materials and Methods A data set of US images in patients with primary breast cancer with clinically negative axillary lymph nodes from Tongji Hospital (974 imaging studies from 2016 to 2018, 756 patients) and an independent test set from Hubei Cancer Hospital (81 imaging studies from 2018 to 2019, 78 patients) were collected. Axillary lymph node status was confirmed with pathologic examination. Three different convolutional neural networks (CNNs) of Inception V3, Inception-ResNet V2, and ResNet-101 architectures were trained on 90% of the Tongji Hospital data set and tested on the remaining 10%, as well as on the independent test set. The performance of the models was compared with that of five radiologists. The models' performance was analyzed in terms of accuracy, sensitivity, specificity, receiver operating characteristic curves, areas under the receiver operating characteristic curve (AUCs), and heat maps. Results The best-performing CNN model, Inception V3, achieved an AUC of 0.89 (95% confidence interval [CI]: 0.83, 0.95) in the prediction of the final clinical diagnosis of axillary lymph node metastasis in the independent test set. The model achieved 85% sensitivity (35 of 41 images; 95% CI: 70%, 94%) and 73% specificity (29 of 40 images; 95% CI: 56%, 85%), and the radiologists achieved 73% sensitivity (30 of 41 images; 95% CI: 57%, 85%; P = .17) and 63% specificity (25 of 40 images; 95% CI: 46%, 77%; P = .34). 
    Conclusion Using US images from patients with primary breast cancer, deep learning models can effectively predict clinically negative axillary lymph node metastasis. Artificial intelligence may provide an early diagnostic strategy for lymph node metastasis in patients with breast cancer with clinically negative lymph nodes. Published under a CC BY 4.0 license. Online supplemental material is available for this article. See also the editorial by Bae in this issue.
    DOI:  https://doi.org/10.1148/radiol.2019190372
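      The sensitivity and specificity figures in the entry above follow directly from the image counts quoted alongside them. As a quick arithmetic check, the sketch below recomputes the point estimates from those counts; the helper name `proportion` is hypothetical, and the confidence intervals are not reproduced because the abstract does not state how they were derived.

```python
def proportion(hits, total):
    """Return hits / total, guarding against an empty denominator."""
    if total <= 0:
        raise ValueError("total must be positive")
    return hits / total


# Counts quoted in the abstract: 41 positive and 40 negative images
# in the independent test set.
model_sensitivity = proportion(35, 41)
model_specificity = proportion(29, 40)
reader_sensitivity = proportion(30, 41)
reader_specificity = proportion(25, 40)

for name, value in [
    ("model sensitivity", model_sensitivity),
    ("model specificity", model_specificity),
    ("radiologist sensitivity", reader_sensitivity),
    ("radiologist specificity", reader_specificity),
]:
    print(f"{name}: {value:.3f}")
```

      After rounding to whole percentages, these agree with the reported 85% and 73% for the model and 73% and 63% for the radiologists.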
  5. Br J Radiol. 2019 Nov 19. 20190580
    Chan HP, Samala RK, Hadjiiski LM.
      Computer-aided diagnosis (CAD) has been a popular area of research and development in the past few decades. In CAD, machine learning methods and multidisciplinary knowledge and techniques are used to analyze the patient information and the results can be used to assist clinicians in their decision making process. CAD may analyze imaging information alone or in combination with other clinical data. It may provide the analyzed information directly to the clinician or correlate the analyzed results with the likelihood of certain diseases based on statistical modeling of the past cases in the population. CAD systems can be developed to provide decision support for many applications in the patient care processes, such as lesion detection, characterization, cancer staging, treatment planning and response assessment, recurrence and prognosis prediction. The new state-of-the-art machine learning technique, known as deep learning (DL), has revolutionized speech and text recognition as well as computer vision. The potential of major breakthrough by DL in medical image analysis and other CAD applications for patient care has brought about unprecedented excitement of applying CAD, or artificial intelligence (AI), to medicine in general and to radiology in particular. In this paper, we will provide an overview of the recent developments of CAD using DL in breast imaging and discuss some challenges and practical issues that may impact the advancement of AI and its integration into clinical workflow.
    DOI:  https://doi.org/10.1259/bjr.20190580
  6. Neurosurgery. 2019 Nov 21. pii: nyz471. [Epub ahead of print]
    Panesar SS, Kliot M, Parrish R, Fernandez-Miranda J, Cagle Y, Britz GW.
      Artificial intelligence (AI)-facilitated clinical automation is expected to become increasingly prevalent in the near future. AI techniques may permit rapid and detailed analysis of the large quantities of clinical data generated in modern healthcare settings, at a level that is otherwise impossible by humans. Subsequently, AI may enhance clinical practice by pushing the limits of diagnostics, clinical decision making, and prognostication. Moreover, if combined with surgical robotics and other surgical adjuncts such as image guidance, AI may find its way into the operating room and permit more accurate interventions, with fewer errors. Despite the considerable hype surrounding the impending medical AI revolution, little has been written about potential downsides to increasing clinical automation. These may include both direct and indirect consequences. Directly, faulty, inadequately trained, or poorly understood algorithms may produce erroneous results, which may have wide-scale impact. Indirectly, increasing use of automation may exacerbate de-skilling of human physicians due to over-reliance, poor understanding, overconfidence, and lack of necessary vigilance of an automated clinical workflow. Many of these negative phenomena have already been witnessed in other industries that have already undergone, or are undergoing "automation revolutions," namely commercial aviation and the automotive industry. This narrative review explores the potential benefits and consequences of the anticipated medical AI revolution from a neurosurgical perspective.
    Keywords:  Artificial intelligence; Automation; Deep learning; Diagnostics; Machine learning; Prognostication; Surgical adjuncts
    DOI:  https://doi.org/10.1093/neuros/nyz471
  7. NPJ Digit Med. 2019; 2: 111
    Patel BN, Rosenberg L, Willcox G, Baltaxe D, Lyons M, Irvin J, Rajpurkar P, Amrhein T, Gupta R, Halabi S, Langlotz C, Lo E, Mammarappallil J, Mariano AJ, Riley G, Seekins J, Shen L, Zucker E, Lungren M.
      Human-in-the-loop (HITL) AI may enable an ideal symbiosis of human experts and AI models, harnessing the advantages of both while at the same time overcoming their respective limitations. The purpose of this study was to investigate a novel collective intelligence technology designed to amplify the diagnostic accuracy of networked human groups by forming real-time systems modeled on biological swarms. Using small groups of radiologists, the swarm-based technology was applied to the diagnosis of pneumonia on chest radiographs and compared against human experts alone, as well as two state-of-the-art deep learning AI models. Our work demonstrates that both the swarm-based technology and deep-learning technology achieved superior diagnostic accuracy than the human experts alone. Our work further demonstrates that when used in combination, the swarm-based technology and deep-learning technology outperformed either method alone. The superior diagnostic accuracy of the combined HITL AI solution compared to radiologists and AI alone has broad implications for the surging clinical AI deployment and implementation strategies in future practice.
    Keywords:  Computer science; Radiography
    DOI:  https://doi.org/10.1038/s41746-019-0189-7
  8. Genome Med. 2019 Nov 19. 11(1): 70
    Dias R, Torkamani A.
      Artificial intelligence (AI) is the development of computer systems that are able to perform tasks that normally require human intelligence. Advances in AI software and hardware, especially deep learning algorithms and the graphics processing units (GPUs) that power their training, have led to a recent and rapidly increasing interest in medical AI applications. In clinical diagnostics, AI-based computer vision approaches are poised to revolutionize image-based diagnostics, while other AI subtypes have begun to show similar promise in various diagnostic modalities. In some areas, such as clinical genomics, a specific type of AI algorithm known as deep learning is used to process large and complex genomic datasets. In this review, we first summarize the main classes of problems that AI systems are well suited to solve and describe the clinical diagnostic tasks that benefit from these solutions. Next, we focus on emerging methods for specific tasks in clinical genomics, including variant calling, genome annotation and variant classification, and phenotype-to-genotype correspondence. Finally, we end with a discussion on the future potential of AI in individualized medicine applications, especially for risk prediction in common complex diseases, and the challenges, limitations, and biases that must be carefully addressed for the successful deployment of AI in medical applications, particularly those utilizing human genetics and genomics data.
    DOI:  https://doi.org/10.1186/s13073-019-0689-8
  9. Cardiovasc Diagn Ther. 2019 Oct;9(Suppl 2): S310-S325
    Arafati A, Hu P, Finn JP, Rickers C, Cheng AL, Jafarkhani H, Kheradvar A.
      Cardiac MRI (CMR) allows non-invasive, non-ionizing assessment of cardiac function and anatomy in patients with congenital heart disease (CHD). The utility of CMR as a non-invasive imaging tool for evaluation of CHD has been growing exponentially over the past decade. Algorithms based on artificial intelligence (AI), and in particular deep learning, have rapidly become a methodology of choice for analyzing CMR. A wide range of applications for AI have been developed to tackle challenges in various aspects of CMR, and significant advances have also been made from image acquisition to image analysis and diagnosis. We include an overview of AI definitions, different architectures, and details on well-known methods. This paper reviews the major deep learning concepts used for analyses of patients with CHD. Finally, we summarize a list of open challenges and concerns to be considered in future studies.
    Keywords:  Cardiac MRI (CMR); artificial intelligence (AI); cardiac segmentation; congenital heart disease (CHD); deep learning
    DOI:  https://doi.org/10.21037/cdt.2019.06.09
  10. J Clin Med. 2019 Nov 14. pii: E1976. [Epub ahead of print]8(11):
    Nguyen DT, Pham TD, Batchuluun G, Yoon HS, Park KR.
      Image-based computer-aided diagnosis (CAD) systems have been developed to assist doctors in the diagnosis of thyroid cancer using ultrasound thyroid images. However, the performance of these systems is strongly dependent on the selection of detection and classification methods. Although there are previous studies on this topic, there is still room to improve the classification accuracy of the existing methods. To address this issue, we propose an artificial intelligence-based method for enhancing the performance of the thyroid nodule classification system. We extract image features from ultrasound thyroid images in two domains: the spatial domain, based on deep learning, and the frequency domain, based on the fast Fourier transform (FFT). Using the extracted features, we apply a cascade classifier scheme to classify the input thyroid images into either benign (negative) or malignant (positive) cases. Through extensive experiments using a public dataset, the thyroid digital image database (TDID) dataset, we show that our proposed method outperforms the state-of-the-art methods and produces up-to-date classification results for the thyroid nodule classification problem.
    Keywords:  Fast Fourier transform; artificial intelligence; deep learning; frequency domain; spatial domain; thyroid nodule classification
    DOI:  https://doi.org/10.3390/jcm8111976
  11. Eur Neurol. 2019 Nov 19. 1-24
    Raghavendra U, Acharya UR, Adeli H.
      BACKGROUND: Authors have been advocating the research ideology that a computer-aided diagnosis (CAD) system trained using lots of patient data and physiological signals and images based on adroit integration of advanced signal processing and artificial intelligence (AI)/machine learning techniques in an automated fashion can assist neurologists, neurosurgeons, radiologists, and other medical providers to make better clinical decisions.
    SUMMARY: This paper presents a state-of-the-art review of research on automated diagnosis of 5 neurological disorders in the past 2 decades using AI techniques: epilepsy, Parkinson's disease, Alzheimer's disease, multiple sclerosis, and ischemic brain stroke using physiological signals and images. Recent research articles on different feature extraction methods, dimensionality reduction techniques, feature selection, and classification techniques are reviewed.
    KEY MESSAGE: CAD systems using AI and advanced signal processing techniques can assist clinicians in analyzing and interpreting physiological signals and images more effectively.
    Keywords:  Classification algorithm; Computer-aided diagnosis; Machine learning; Neurological disorder
    DOI:  https://doi.org/10.1159/000504292
  12. J Clin Sleep Med. 2019 Nov 15. 15(11): 1599-1608
    Stretch R, Ryden A, Fung CH, Martires J, Liu S, Balasubramanian V, Saedi B, Hwang D, Martin JL, Della Penna N, Zeidler MR.
      STUDY OBJECTIVES: Home sleep apnea testing (HSAT) is an efficient and cost-effective method of diagnosing obstructive sleep apnea (OSA). However, nondiagnostic HSAT necessitates additional tests that erode these benefits, delaying diagnoses and increasing costs. Our objective was to optimize this diagnostic pathway by using predictive modeling to identify patients who should be referred directly to polysomnography (PSG) due to their high probability of nondiagnostic HSAT.
    METHODS: HSAT performed as the initial test for suspected OSA within the Veterans Administration Greater Los Angeles Healthcare System was analyzed retrospectively. Data were extracted from pre-HSAT questionnaires and the medical record. Tests were diagnostic if there was a respiratory event index (REI) ≥ 5 events/h. Tests with REI < 5 events/h or technical inadequacy (two outcomes requiring additional testing with PSG) were considered nondiagnostic. Standard logistic regression models were compared with models trained using machine learning techniques.
    RESULTS: Models were trained using 80% of available data and validated on the remaining 20%. Performance was evaluated using partial area under the precision-recall curve (pAUPRC). Machine learning techniques consistently yielded higher pAUPRC than standard logistic regression, which had pAUPRC of 0.574. The random forest model outperformed all other models (pAUPRC 0.862). Preferred calibration of this model yielded the following: sensitivity 0.46, specificity 0.95, positive predictive value 0.81, negative predictive value 0.80.
    CONCLUSIONS: Compared with standard logistic regression models, machine learning models improve prediction of patients requiring in-laboratory PSG. These models could be implemented into a clinical decision support tool to help clinicians select the optimal test to diagnose OSA.
    Keywords:  home sleep apnea testing; machine learning; obstructive sleep apnea; predictive model
    DOI:  https://doi.org/10.5664/jcsm.8020
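      The positive and negative predictive values quoted in the entry above depend on how common nondiagnostic HSAT is in the tested population, not only on sensitivity and specificity. The sketch below shows that relationship through Bayes' rule. The sensitivity and specificity are taken from the abstract; the prevalence of 0.32 is an assumed illustrative value that the abstract does not report, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    tp = sensitivity * prevalence                   # true positive fraction
    fn = (1.0 - sensitivity) * prevalence           # false negative fraction
    tn = specificity * (1.0 - prevalence)           # true negative fraction
    fp = (1.0 - specificity) * (1.0 - prevalence)   # false positive fraction
    return tp / (tp + fp), tn / (tn + fn)


# Sensitivity and specificity are quoted from the abstract.
# The prevalence of nondiagnostic HSAT is NOT given there; 0.32 is an assumption.
ppv, npv = predictive_values(sensitivity=0.46, specificity=0.95, prevalence=0.32)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

      Under that assumed prevalence the sketch gives values close to the reported 0.81 and 0.80; a different prevalence would shift both numbers even with the same model calibration.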
  13. Eur J Radiol. 2019 Nov 09. pii: S0720-048X(19)30392-4. [Epub ahead of print] 121: 108742
    van Hoek J, Huber A, Leichtle A, Härmä K, Hilt D, von Tengg-Kobligk H, Heverhagen J, Poellinger A.
      PURPOSE: To evaluate the opinion and assessment of radiologists, surgeons and medical students on a number of important topics regarding the future of radiology, such as artificial intelligence (AI), turf battles, teleradiology and 3D-printing.
    METHOD: An online questionnaire was created using the SurveyMonkey platform targeting radiologists, students and surgeons throughout the German-speaking part of Switzerland. A total of 170 people participated in the survey (59 radiologists, 56 surgeons and 55 students). Statistical analysis was carried out using the Kruskal-Wallis test with Dunn's multiple comparison post-hoc tests.
    RESULTS: While the majority of participants agreed that AI should be included as a support system in radiology (Likert scale 0-10: Median value 8), surgeons were less supportive than radiologists (p = 0.001). Students saw a potential threat of AI as more likely than radiologists did (p = 0.041). When asked whether they were concerned about "turf losses" from radiology to other disciplines, radiologists were much more likely to agree than students (p < 0.001). Of the students who do not intend to specialize in radiology, 26% stated that AI was one of the reasons. Surgeons advocate the use of teleradiology.
    CONCLUSIONS: With regard to AI, radiologists expect their workflow to become more efficient and tend to support the use of AI, whereas medical students and surgeons tend to be more skeptical towards this technology. Medical students see AI as a potential threat to diagnostic radiologists, while radiologists themselves are rather afraid of turf losses.
    Keywords:  Medical – surveys - questionnaires; Radiology – artificial intelligence - students
    DOI:  https://doi.org/10.1016/j.ejrad.2019.108742
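      The Kruskal-Wallis test named in the entry above compares groups by pooling all responses, ranking them, and checking whether the rank sums differ more than chance would suggest. The sketch below computes the tie-corrected H statistic in plain Python. It is an illustration under stated assumptions, not the survey's analysis: the function name and the Likert-style toy responses are hypothetical, and the p-value (from a chi-square distribution with k-1 degrees of freedom) and Dunn's post-hoc tests are left out.

```python
from collections import Counter


def kruskal_wallis_h(*groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for two or more groups."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    counts = Counter(pooled)
    # Average 1-based rank for each distinct value, so tied values share a rank.
    rank_of = {}
    start = 1
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = start + (t - 1) / 2.0
        start += t
    # H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
    weighted = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    correction = 1.0 - sum(t ** 3 - t for t in counts.values()) / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


# Hypothetical 0-10 Likert responses for three respondent groups.
radiologists = [9, 8, 10, 8, 9]
surgeons = [6, 7, 5, 8, 6]
students = [7, 8, 6, 7, 9]
print(f"H = {kruskal_wallis_h(radiologists, surgeons, students):.2f}")
```

      Because the statistic uses ranks only, it suits ordinal Likert data where differences between scale points cannot be assumed equal.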
  14. BMC Med Inform Decis Mak. 2019 Nov 21. 19(1): 231
    Kang MJ, Kim SY, Na DL, Kim BC, Yang DW, Kim EJ, Na HR, Han HJ, Lee JH, Kim JH, Park KH, Park KW, Han SH, Kim SY, Yoon SJ, Yoon B, Seo SW, Moon SY, Yang Y, Shim YS, Baek MJ, Jeong JH, Choi SH, Youn YC.
      BACKGROUND: Neuropsychological tests (NPTs) are important tools for informing diagnoses of cognitive impairment (CI). However, interpreting NPTs requires specialists and is thus time-consuming. To streamline the application of NPTs in clinical settings, we developed and evaluated the accuracy of a machine learning algorithm using multi-center NPT data.
    METHODS: Multi-center data were obtained from 14,926 formal neuropsychological assessments (Seoul Neuropsychological Screening Battery), which were classified into normal cognition (NC), mild cognitive impairment (MCI) and Alzheimer's disease dementia (ADD). We trained a machine learning model with an artificial neural network algorithm using TensorFlow (https://www.tensorflow.org) to distinguish cognitive state with the 46-variable data and measured prediction accuracies from 10 randomly selected datasets. The features of the NPT were listed in order of their contribution to the outcome using Recursive Feature Elimination.
    RESULTS: Across the ten runs, the mean accuracy for identifying CI (MCI and ADD) was 96.66 ± 0.52% on the balanced dataset and 97.23 ± 0.32% on the clinic-based dataset, and the accuracies for predicting cognitive state (NC, MCI or ADD) were 95.49 ± 0.53% and 96.34 ± 1.03%. The sensitivities for detecting CI and MCI in the balanced dataset were 96.0% and 96.0%, and the specificities were 96.8% and 97.4%, respectively. The 'time orientation' and '3-word recall' scores of the MMSE were highly ranked features in predicting CI and cognitive state. Twelve features, reduced from the 46 NPT variables with age and education, contributed to more than 90% accuracy in predicting cognitive impairment.
    CONCLUSIONS: The machine learning algorithm for NPTs shows potential for use as a reference in differentiating cognitive impairment in the clinical setting.
    Keywords:  Alzheimer’s disease; Dementia; Machine learning; Mild cognitive impairment; Neuropsychological test
    DOI:  https://doi.org/10.1186/s12911-019-0974-x
  15. JMIR Res Protoc. 2019 Nov 18. 8(11): e14245
    Piau A, Lepage B, Bernon C, Gleizes MP, Nourhashemi F.
      BACKGROUND: Most frail older persons are living at home, and we face difficulties in achieving seamless monitoring to detect adverse health changes. Even more important, this lack of follow-up could have a negative impact on the living choices made by older individuals and their care partners. People could give up their homes for the more reassuring environment of a medicalized living facility. We have developed a low-cost unobtrusive sensor-based solution to trigger automatic alerts in case of an acute event or subtle changes over time. It could facilitate older adults' follow-up in their own homes, and thus support independent living.
    OBJECTIVE: The primary objective of this prospective open-label study is to evaluate the relevance of the automatic alerts generated by our artificial intelligence-driven monitoring solution as judged by the recipients: older adults, caregivers, and professional support workers. The secondary objective is to evaluate its ability to detect subtle functional and cognitive decline and major medical events.
    METHODS: The primary outcome will be evaluated for each successive 2-month follow-up period to estimate the progression of our learning algorithm performance over time. In total, 25 frail or disabled participants, aged 75 years and above and living alone in their own homes, will be enrolled for a 6-month follow-up period.
    RESULTS: The first phase with 5 participants for a 4-month feasibility period has been completed and the expected completion date for the second phase of the study (20 participants for 6 months) is July 2020.
    CONCLUSIONS: The originality of our real-life project lies in the choice of the primary outcome and in our user-centered evaluation. We will evaluate the relevance of the alerts and the algorithm performance over time according to the end users. The first-line recipients of the information are the older adults and their care partners rather than health care professionals. Despite the fast pace of electronic health devices development, few studies have addressed the specific everyday needs of older adults and their families.
    TRIAL REGISTRATION: ClinicalTrials.gov NCT03484156; https://clinicaltrials.gov/ct2/show/NCT03484156.
    INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): PRR1-10.2196/14245.
    Keywords:  artificial intelligence; frailty; monitoring; older adults; participatory design; sensors
    DOI:  https://doi.org/10.2196/14245
  16. JMIR Mhealth Uhealth. 2019 Nov 18. 7(11): e15771
    Brar Prayaga R, Agrawal R, Nguyen B, Jeong EW, Noble HK, Paster A, Prayaga RS.
      BACKGROUND: Nonadherence among patients with chronic disease continues to be a significant concern, and the use of text message refill reminders has been effective in improving adherence. However, questions remain about how differences in patient characteristics and demographics might influence the likelihood of refill using this channel.
    OBJECTIVE: The aim of this study was to evaluate the efficacy of an SMS-based refill reminder solution using conversational artificial intelligence (AI; an automated system that mimics human conversations) with a large Medicare patient population and to explore the association and impact of patient demographics (age, gender, race/ethnicity, language) and social determinants of health on successful engagement with the solution to improve refill adherence.
    METHODS: The study targeted 99,217 patients with chronic disease, median age of 71 years, for medication refill using the mPulse Mobile interactive SMS text messaging solution from December 2016 to February 2019. All patients were partially adherent or nonadherent Medicare Part D members of Kaiser Permanente, Southern California, a large integrated health plan. Patients received SMS reminders in English or Spanish and used simple numeric or text responses to validate their identity, view their medication, and complete a refill request. The refill requests were processed by Kaiser Permanente pharmacists and support staff, and refills were picked up at the pharmacy or mailed to patients. Descriptive statistics and predictive analytics were used to examine the patient population and their refill behavior. Qualitative text analysis was used to evaluate quality of conversational AI.
    RESULTS: Over the course of the study, 273,356 refill reminder requests were sent to 99,217 patients, resulting in 47,552 refill requests (17.40%). This was consistent with earlier pilot study findings. Of those who requested a refill, 54.81% (26,062/47,552) did so within 2 hours of the reminder. There was a strong inverse relationship (r(10)=-0.93) between social determinants of health and refill requests. Spanish speakers (5149/48,156, 10.69%) had significantly lower refill request rates compared with English speakers (42,389/225,060, 18.83%; χ²(1) [n=273,216]=1829.2; P<.001). There were also significantly different rates of refill requests by age band (χ²(6) [n=268,793]=1460.3; P<.001), with younger patients requesting refills at a higher rate. Finally, the vast majority (284,598/307,484, 92.23%) of patient responses were handled using conversational AI.
    CONCLUSIONS: Multiple factors impacted refill request rates, including a strong association between social determinants of health and refill rates. The findings suggest that higher refill requests are linked to language, race/ethnicity, age, and social determinants of health, and that English speakers, whites, those younger than 75 years, and those with lower social determinants of health barriers are significantly more likely to request a refill via SMS. A neural network-based predictive model with an accuracy level of 78% was used to identify patients who might benefit from additional outreach to narrow identified gaps based on demographic and socioeconomic factors.
    Keywords:  Medicare patients; SMS; conversational AI; health disparities; machine learning; medication adherence; predictive modeling; refill adherence; social determinants of health; text messaging
    DOI:  https://doi.org/10.2196/15771
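      The language comparison in the entry above is a test of independence on a 2x2 table (language by whether a reminder led to a refill request). The sketch below recomputes the two request rates and a Pearson chi-square from the counts quoted in the abstract. Using the uncorrected Pearson statistic is an assumption, since the abstract does not say which variant was applied, and the function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Counts quoted in the abstract: refill requests / reminders by language.
spanish_requests, spanish_reminders = 5149, 48156
english_requests, english_reminders = 42389, 225060

spanish_rate = spanish_requests / spanish_reminders
english_rate = english_requests / english_reminders
chi2 = chi_square_2x2(
    spanish_requests, spanish_reminders - spanish_requests,
    english_requests, english_reminders - english_requests,
)
print(f"Spanish {spanish_rate:.2%}, English {english_rate:.2%}, chi-square {chi2:.1f}")
```

      The rates match the reported 10.69% and 18.83%, and the statistic lands close to the reported 1829.2; a small gap is expected if the original analysis applied a continuity correction.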