bims-arines Biomed News
on AI in evidence synthesis
Issue of 2026-01-04
eight papers selected by
Farhad Shokraneh, Systematic Review Consultants LTD



  1. Neuroimage Rep. 2026 Mar;6(1): 100306
      Online, text-based meta-analysis tools for large databases represent a new digital advance for medical, health, and neuroscience research, among other fields. NeuroQuery is an instance of such a tool for neuroimaging research; it employs supervised machine learning to draw from over 13,000 publications and perform a meta-synthesis, generating predictive fMRI scans based on keyword combinations. Although NeuroQuery is a sophisticated tool, a lack of understanding of how it practically works and its limitations may lead to flawed results and conclusions, undermining its potential value. We review potential risks and limitations, including algorithm limitations, potential biases in the database, and user misinterpretation. Simulating the perspective of an end user, we present an example of unreliable but possible meta-analysis results on autism spectrum disorder (ASD). We then report an analysis of the underlying query from a sophisticated user perspective. Using the same examples, we illustrate possible improvements for the use of NeuroQuery and identify how this tool may be valuable in the context of emerging machine-learning meta-analytical approaches. Although a thorough understanding of NeuroQuery is helpful, we conclude that understanding its limitations plays a more critical role in ensuring the validity and reliability of its use. While NeuroQuery is currently not appropriate for rigorous scientific analysis, it could be useful for hypothesis development, preliminary fMRI data mining, and exploratory and supplemental analysis, as well as literature surveys.
    Keywords:  ASD; Autism spectrum disorder; Machine learning; Meta-analysis; Neural networks; NeuroQuery; Neuroscience
    DOI:  https://doi.org/10.1016/j.ynirp.2025.100306
  2. Biol Methods Protoc. 2025;10(1): bpaf088
      The integration of computational methods with traditional qualitative research has emerged as a transformative paradigm in healthcare research. Computational Grounded Theory (CGT) combines the interpretive depth of grounded theory with computational techniques including machine learning and natural language processing. This systematic review examines CGT application in healthcare research through analysis of eight studies demonstrating the method's utility across diverse contexts. Following a systematic search across five databases and PRISMA-aligned screening, eight papers applying CGT in healthcare were analyzed. Studies spanned COVID-19 risk perception, medical AI adoption, mental health interventions, diabetes management, women's health technology, online health communities, and social welfare systems, employing computational techniques including Latent Dirichlet Allocation (LDA), sentiment analysis, word embeddings, and deep learning algorithms. Results demonstrate CGT's capacity for analyzing large-scale textual data (100,000+ documents) while maintaining theoretical depth, with consistent reports of enhanced analytical capacity, latent pattern identification, and novel theoretical insights. However, challenges include technical complexity, interpretation validity, resource requirements, and the need for interdisciplinary expertise. CGT represents a promising methodological innovation for healthcare research, particularly valuable for understanding complex healthcare phenomena, patient experiences, and technology adoption, though the small sample size (8 of 892 screened articles) reflects its nascent application and limits generalizability.
Future research should focus on standardizing methodological procedures, developing best practices, expanding applications, and addressing accessibility barriers.
    Keywords:  computational grounded theory; digital health; natural language processing; qualitative methods; topic modeling
    DOI:  https://doi.org/10.1093/biomethods/bpaf088
  3. West J Nurs Res. 2025 Dec 29. 1939459251406413
       BACKGROUND: Artificial intelligence (AI) is increasingly used in qualitative research to support tasks such as coding, thematic identification, and pattern recognition. While AI enhances productivity in processing large volumes of unstructured data, it presents challenges that include limited contextual understanding, algorithmic bias, ethical concerns, and potential privacy issues.
    OBJECTIVE: This article aims to explore the integration of AI-assisted tools into qualitative content analysis, focusing on methodological rigor, ethical standards, and how AI tools can effectively support, rather than replace, human insight.
    METHODS: We designed an AI-assisted, thematic content analysis study of a farm animal-assisted therapy intervention involving domesticated ducks to support individuals with traumatic brain injury. We utilized Insight7, an AI software, chosen for its thematic detection capabilities and robust data security. Human researchers and the AI independently analyzed semistructured exit interview transcripts to allow for comparative validation of findings.
    RESULTS: There was general agreement between human and AI-generated themes. However, the AI occasionally misclassified typical participant experiences as challenges, highlighting the tool's limitations in contextual interpretation. Human oversight proved essential in ensuring accurate and nuanced data analysis.
    CONCLUSIONS: AI offers valuable support in qualitative research, especially in handling large datasets. However, its limitations underscore the importance of human involvement to maintain interpretive accuracy and ethical integrity. Future research should refine AI tools to enhance their contextual sensitivity while preserving the foundational role of human interpretation in qualitative inquiry.
    Keywords:  AI utilization in qualitative research; animal-assisted therapies; artificial intelligence; content analysis; thematic analysis
    DOI:  https://doi.org/10.1177/01939459251406413
  4. Science. 2026 Jan;391(6780): 5
      It's hard to talk about any topic in science or education today without the subject of artificial intelligence (AI) coming up, whether large language models should be allowed to aid in searching for a scientific paper or even to write or review the paper itself. In some of the wildest speculations, the humans involved in conducting scientific studies and experiments and vetting the results for publication will be steadily eliminated from the process. But when such grandiose rhetoric starts flying, we at Science try to keep calm and carry on, contributing to a robust, human-curated research literature that will stand the test of time.
    DOI:  https://doi.org/10.1126/science.aee8267
  5. Nature. 2026 Jan;649(8095): 7
      
    Keywords:  Computer science; Ethics; Machine learning; Society
    DOI:  https://doi.org/10.1038/d41586-025-04106-0
  6. Syst Rev. 2025 Dec 30. 14(1): 254
     BACKGROUND: Advanced practice nurses play a vital role in healthcare innovation, delivering high-quality care and improving patient outcomes. Leadership is a core competency of advanced practice nurses, empowering them to drive systemic improvements and foster collaboration. However, these master's-level educated nurses often encounter challenges in assuming leadership roles, including limited recognition and competing demands on their time. The growing volume of healthcare-related research, combined with the lack of a comprehensive evidence base on the determinants and outcomes of their leadership behaviours, complicates the development of effective programmes. This protocol outlines a systematic approach to addressing these challenges, using an AI tool to efficiently manage the expanding evidence base and provide a detailed understanding of the factors influencing advanced practice nurses' leadership behaviours.
    METHODS: This protocol follows the PRISMA-P 2015 guidelines to outline a systematic review investigating the determinants and outcomes of advanced practice nurses' leadership behaviours. It employs the SPIDER tool for eligibility criteria, encompassing studies that explore advanced practice nursing leadership behaviours and their determinants and outcomes. Eligible studies include quantitative, qualitative and mixed-methods research, focusing on advanced practice nursing roles. The protocol also outlines a workflow for AI-aided title and abstract screening using ASReview LAB, incorporating multi-phase human validation to ensure accuracy and reliability. Data synthesis will utilise narrative synthesis for quantitative data and meta-aggregation for qualitative findings, integrating results through narrative weaving.
    DISCUSSION: This protocol addresses a critical gap in nursing research by systematically exploring the determinants influencing advanced practice nurses' leadership behaviours and their outcomes. It provides evidence to inform the development of tailored programmes aimed at empowering advanced practice nurses to maximise their leadership potential. Additionally, the protocol demonstrates how AI tools can enhance systematic review efficiency while maintaining methodological rigour. The findings will not only contribute to advancing nursing practice but also highlight the transformative potential of AI in research synthesis, ensuring timely and robust evidence generation amidst the expanding volume of healthcare-related research.
    SYSTEMATIC REVIEW REGISTRATION: PROSPERO CRD42025644174.
    Keywords:  Advanced practice nursing; Artificial intelligence; Behavioural research; Leadership; Machine learning; Systematic review
    DOI:  https://doi.org/10.1186/s13643-025-02939-4
  7. Front Pediatr. 2025;13: 1659812
     Background: Artificial intelligence (AI), particularly AI-based large language models (LLM) like ChatGPT, is increasingly shaping how information is accessed, offering patients a new source for understanding complex medical conditions. Given the physical, emotional, and logistical challenges that parents face when their baby is diagnosed with developmental dysplasia of the hip (DDH), the demand for clear and accessible educational resources is high. This study aimed to evaluate the quality and reliability of ChatGPT's responses to frequently asked questions about DDH.
    Methods: This study assessed the quality of responses generated by the AI chatbot ChatGPT 4o to eight frequently asked questions about DDH, derived from real consultations in a pediatric orthopedic clinic. Responses were generated during one interaction per question using a ChatGPT account not previously exposed to medical information. Responses were evaluated by two independent readers using a standardized rating system, comparing them to current literature, patient education resources, and consensus guidelines. Each response was categorized by its level of informational accuracy and completeness, and descriptive statistics were calculated to quantify performance.
    Results: ChatGPT 4o was able to generate structured responses to all eight parental questions. The responses were rated as excellent in 12.5% of cases, satisfactory with minimal clarification in 25.0%, satisfactory with moderate clarification in 50.0%, and unsatisfactory due to missing or inaccurate information in 12.5%.
    Conclusion: ChatGPT provided satisfactory answers to questions about DDH and may serve as a useful supplementary information resource for parents. However, due to limitations in presenting detailed diagnostic and treatment pathways, it should be viewed as an adjunct to, not a replacement for, specialist medical consultation.
    Keywords:  artificial intelligence; developmental dysplasia of the hip; hip screening; infant hip; large language model
    DOI:  https://doi.org/10.3389/fped.2025.1659812
  8. Dis Colon Rectum. 2025 Dec 29.
       BACKGROUND: ChatGPT, an artificial intelligence large language model chatbot, transforms how patients obtain information regarding health concerns including sensitive questions.
    OBJECTIVE: To assess and compare the accuracy, completeness, and consistency of answers from ChatGPT-3.5, -4, -5, and -5 Plus to common questions regarding fecal incontinence.
    DESIGN: Thirty questions written in lay language, based on the American Society of Colon and Rectal Surgeons Clinical Practice Guidelines for fecal incontinence, were presented in sequential order twice to all ChatGPT versions. Question categories included general/background, diagnosis, treatment, and miscellaneous. Three board-certified professors of colorectal surgery with expertise in treating fecal incontinence rated the answers "yes" or "no" for accuracy, completeness, and consistency with guidelines. A "no" prompted a free-text response. Quantitative and qualitative analyses were performed.
    SETTINGS: ChatGPT-3.5, ChatGPT-4, ChatGPT-5 (free access), ChatGPT-5 Plus (paid subscription).
    INTERVENTION: Patient questions.
    MAIN OUTCOME MEASURES: Accuracy, completeness and consistency with practice guidelines.
    RESULTS: Reviewers rated 61% of answers accurate, 65% complete, and 68% consistent for ChatGPT-3.5; 72%, 73%, and 69% for ChatGPT-4; 50%, 73%, and 68% for ChatGPT-5 free; and 83%, 95%, and 82% for ChatGPT-5 Plus, respectively. Three questions triggered ChatGPT's content warning, flagging them as inappropriate and terminating the chat. Qualitative analyses revealed 10 emergent sub-themes; the most frequent was inaccuracy of treatment recommendations.
    LIMITATIONS: The current set of chatbots is not intended for medical use.
    CONCLUSIONS: No version of ChatGPT provided answers that were entirely accurate, complete, or consistent with clinical practice guidelines; however, the paid version performed markedly better than the rest. Analysis of ChatGPT-5 free vs Plus highlighted a dimension of disparity introduced by paywall-contingent model performance. Our study emphasizes the necessity for patient and provider education on the positives and pitfalls of this technology regarding health information. See Video Abstract.
    Keywords:  Artificial intelligence; Clinical Practice Guidelines; Fecal incontinence; patient education
    DOI:  https://doi.org/10.1097/DCR.0000000000004092