bims-librar Biomed News
on Biomedical librarianship
Issue of 2025-05-25
29 papers selected by
Thomas Krichel, Open Library Society



  1. Int J Med Inform. 2025 May 09. pii: S1386-5056(25)00182-0. [Epub ahead of print] 202: 105965
       INTRODUCTION: Digital point-of-care information resources are frequently used by clinicians to answer clinical questions. An evidence-based disease management database (DynaMed) was merged with a pharmaceutical knowledge base (Micromedex). We evaluated the ability of the combined solution, DynaMedex, to answer clinical questions.
    METHODS: Real-world questions were used for testing and were categorized by information type and specialty area. Two pharmacists independently performed 600 searches for 300 questions, using keyword search and Watson Assistant (WA). Search results were evaluated based on whether information was found (yes, no), relevance to the question (relevant, not relevant), difficulty in finding the answer (easy, medium, hard), and quality of the evidence (good, fair, poor).
    RESULTS: An answer was found 86.3% of the time using keyword search and 81.0% of the time using WA. In keyword searches, 86.0% of answers were considered relevant, compared with 74.5% in WA. Most answers were easy to find (78.7% in keyword search, 94.4% in WA). The quality of evidence supporting answers was rated good, fair, or poor in 62.7%, 36.4%, and 0.9% of keyword searches and in 50.3%, 47.8%, and 1.9% of WA searches, respectively.
    CONCLUSION: Pharmacists found answers to most clinical questions easily, with good-quality, evidence-based information and a high agreement rate. The resource could be further improved by recognizing a wider range of search terms, standardizing the location of drug and disease information in appropriate sections, providing citations for the highest-quality evidence where available, and adding content types that have not yet been incorporated.
    Keywords:  Drug information; Human factors; Medical information science; Searching behavior
    DOI:  https://doi.org/10.1016/j.ijmedinf.2025.105965
  2. Med Ref Serv Q. 2025 May 19. 1-23
      Access to precise and reliable scientific evidence is one of the fundamental principles of Evidence-Based Medicine (EBM) in clinical decision-making. Medical librarians, by employing advanced search and information retrieval techniques, play a pivotal role in accessing such evidence. This observational study compared the search and evidence retrieval behaviors of two groups: medical librarians and medical specialists familiar with EBM and systematic reviews. The study population consisted of 40 participants (20 medical librarians and 20 medical specialists), whose performance in retrieving the best available evidence from credible sources was evaluated using two distinct clinical scenarios. A researcher-developed checklist, created in accordance with the Guidelines for Evaluating Evidence-Based Search Strategies, was used to assess participants' search performance. The findings revealed that medical librarians employed structured search strategies and were more successful in retrieving accurate evidence; they consistently used structured search strategies, field-specific search tools, and narrowing techniques in all cases. In contrast, medical specialists spent less time on searches and showed a greater tendency to use natural-language terms in their queries; they did not systematically employ controlled vocabulary or place keywords in specific fields, such as titles, keywords, or abstracts. In conclusion, librarians' expertise in accessing the best available evidence underscores their crucial role in supporting medical specialists in obtaining and implementing evidence, thereby improving the quality and reliability of evidence-based practice in healthcare settings.
    Keywords:  Evidence-based medicine; information retrieval; information seeking behavior; medical librarians
    DOI:  https://doi.org/10.1080/02763869.2025.2471886
  3. Health Info Libr J. 2025 May 20.
      CILIP's second Green Libraries Conference was held at the British Library on 25 November 2024. HLG's Sustainability Leads Janine Hall and Adam Tocock report on the programme and its relevance to healthcare library staff.
    Keywords:  continuing professional development (CPD); costs and cost analysis; health care; health disparities; libraries; public health
    DOI:  https://doi.org/10.1111/hir.12571
  4. Am J Nurs. 2025 Jun 01. 125(6): 40-46
     ABSTRACT: This is the sixth article in a new series designed to provide readers with insight into educating nurses about evidence-based decision-making (EBDM). It builds on AJN's award-winning previous series, Evidence-Based Practice, Step by Step and EBP 2.0: Implementing and Sustaining Change (to access both series, go to https://links.lww.com/AJN/A133). This follow-up series on EBDM will address how to teach and facilitate learning about the evidence-based practice (EBP) and quality improvement (QI) processes, and how they impact health care quality. This series is relevant for all nurses interested in EBP and QI, especially DNP faculty and students. The brief case scenario included in each article describes one DNP student's journey. To access previous articles in this EBDM series, go to https://links.lww.com/AJN/A256.
    DOI:  https://doi.org/10.1097/AJN.0000000000000084
  5. Am J Nurs. 2025 Jun 01. 125(6): 60-63
     ABSTRACT: Clinical nurses are responsible for ensuring their practice aligns with current evidence-based practice (EBP). They are also increasingly involved in participating in and leading research. Hospital-based nurse scientists help bring EBP to clinical nurses and engage clinical nurses in research. Ensuring robust access to current literature and building on the current evidence through further scholarly dissemination requires close collaboration between nurse scientists and medical librarians. Expanding opportunities for clinical nurses to participate in evidence-based inquiry and research can be realized through strong collaborative relationships with nurse scientists and medical librarians. However, significant barriers impede nurses from participating in or leading research. Here, we describe several strategies we implemented at our academic level 1 trauma center to ensure clinical nurses can participate in and lead nursing research.
    Keywords:  evidence-based practice; medical librarians; nurse scientist; nursing research; nursing science
    DOI:  https://doi.org/10.1097/AJN.0000000000000095
  6. Am J Nurs. 2025 Jun 01. 125(6): 47
      An integral partner to the DNP educational team.
    DOI:  https://doi.org/10.1097/AJN.0000000000000081
  7. Health Info Libr J. 2025 May 20.
      Networks draw people together, allow us to share ideas and best practices, and make connections across a wide range of library and knowledge services. However, 'networking' is a term that often brings people out in a cold sweat (particularly the introverts amongst us), conjuring up awkward small talk over drinks breaks at professional events! In CILIP's Health Libraries Group (HLG), we recognise that our rich network of members, across a wide range of organisations, is one of our greatest strengths. In this editorial, we introduce our new shadowing network, which aims to connect our members and offer cross-sectoral reflection and development for individuals and for library and knowledge services. We encourage you to add your library to the network so that together we can expand our horizons and develop as individuals and as a sector.
    Keywords:  careers; continuing professional development (CPD); professional associations
    DOI:  https://doi.org/10.1111/hir.12567
  8. J Assist Reprod Genet. 2025 May 19.
       OBJECTIVES: This review article outlines the key aspects of electronic search strategies used for systematic reviews, with a particular focus on developing a search strategy for systematic reviews in reproductive medicine. Additionally, we aimed to gather information on studies assessing the quality of literature searches and address conceptualization, search terms, database selection, peer review, translation, documentation, and report of searches. This review and practical guide has been written to assist not only those with experience and knowledge in health research but also beginner teams seeking the skills to conduct systematic reviews in the field. It uses the MEDLINE database, with both PubMed and Ovid interfaces, as examples to illustrate the process of developing a search strategy.
    METHODS: A narrative review of the literature was conducted, and a practical guide for developing search strategies was developed.
    RESULTS: There is a significant lack of information on the quality and effectiveness of search strategies used in systematic reviews within the field of reproductive medicine, as well as on the workflow for developing these strategies. For specialized topics, searching at least three to five databases is recommended to achieve high recall. It is also advisable to follow the PRESS guidelines and to report the databases searched, the date of access, and terms used.
    DISCUSSION: This review may serve as a foundation for future research to address these gaps. As a starting point, we have provided a concise and practical overview of the key elements of search strategy development. The appendices, which include practical examples, a compilation of existing sources, guidelines, and a glossary of terms, can help health professionals and researchers create a more advanced and reproducible literature search when planning a systematic review project.
    Keywords:  ART; Evidence synthesis; Recall and precision; Reproductive medicine; Search strategy; Systematic reviews
    DOI:  https://doi.org/10.1007/s10815-025-03498-2
  9. Comput Biol Med. 2025 May 17. pii: S0010-4825(25)00714-0. [Epub ahead of print]192(Pt B): 110363
     BACKGROUND: Existing tools for reference retrieval using large language models (LLMs) frequently generate inaccurate, gray-literature, or fabricated citations, leading to poor accuracy. In this study, we aim to address this gap by developing a highly accurate reference retrieval system focused on the precision and reliability of citations across five medical fields.
    METHODS: We developed LITERAS, an open-source multi-AI-agent system of literature review and citation retrieval agents designed to generate literature review drafts with accurate and confirmable citations. LITERAS integrates search of the largest biomedical literature database (MEDLINE) via PubMed's application programming interface with bidirectional inter-agent communication to enhance citation accuracy and reliability. To evaluate its performance, we compared LITERAS to state-of-the-art LLMs, Sonar and Sonar-Pro by Perplexity AI. The evaluation covered five distinct medical disciplines (Oncology, Cardiology, Rheumatology, Psychiatry, and Infectious Diseases/Public Health), focusing on the credibility, precision, and confirmability of citations, as well as the overall quality of the referenced sources.
    RESULTS: LITERAS achieved near-perfect citation accuracy (i.e., whether references match real publications) at 99.82%, statistically indistinguishable from Sonar (100.00%, p = 0.065) and Sonar-Pro (99.93%, p = 0.074). When focusing on referencing accuracy (the consistency between in-text citation details and metadata), LITERAS (96.81%) significantly outperformed Sonar (89.07%, p < 0.001) and matched Sonar-Pro (96.33%, p = 0.139). Notably, LITERAS relied exclusively on Q1-Q2, peer-reviewed journals (0% nonacademic content), whereas Sonar contained 35.60% nonacademic sources (p < 0.01) and Sonar-Pro 6.47% (p < 0.001). However, Sonar-Pro cited higher-impact journals than LITERAS (median impact factor (IF) = 14.70 vs LITERAS 3.70, p < 0.001). LITERAS's multi-agent loop (2.2 ± 1.34 iterations per query) minimized hallucinations and consistently prioritized recent articles (IQR = 2023-2024). The field-specific analysis showed the largest IF discrepancies in oncology (Sonar-Pro 42.1 vs LITERAS 4.3, p < 0.001), reflecting Sonar-Pro's preference for major consortium guidelines and high-impact meta-analyses.
    CONCLUSION: LITERAS retrieved significantly more recent academic journal articles and generated longer summary reports than academic-search LLM approaches in literature review tasks. This work provides insights into improving the reliability of AI-assisted literature review systems.
    Keywords:  Agents; Artificial intelligence; Citations; Large language models; Literature review; Multi AI agents
    DOI:  https://doi.org/10.1016/j.compbiomed.2025.110363
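    LITERAS queries MEDLINE through PubMed's application programming interface. As a rough illustration of the kind of citation check such a pipeline performs (not the authors' actual code), the sketch below asks NCBI's public E-utilities whether a cited title resolves to a real PubMed record; the helper name and example title are invented for illustration.
```python
# Minimal sketch: checking whether a cited title resolves to a real PubMed record,
# the kind of verification a citation-retrieval agent can run against MEDLINE.
# The E-utilities endpoint is NCBI's public API; helper name and query are illustrative.
import json
import urllib.parse
import urllib.request

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_ids_for_title(title: str) -> list[str]:
    """Return PMIDs whose title field matches the quoted title, or [] if none."""
    params = {
        "db": "pubmed",
        "term": f'"{title}"[Title]',
        "retmode": "json",
        "retmax": "5",
    }
    url = f"{EUTILS}?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        data = json.load(resp)
    return data["esearchresult"].get("idlist", [])

if __name__ == "__main__":
    pmids = pubmed_ids_for_title("A randomized trial of intensive versus standard blood-pressure control")
    print("Citation confirmed" if pmids else "No matching PubMed record", pmids)
```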
  10. Res Social Adm Pharm. 2025 May 15. pii: S1551-7411(25)00250-5. [Epub ahead of print]
       INTRODUCTION: Medical Subject Headings (MeSH) are the controlled vocabulary used by the National Library of Medicine (NLM) to index articles covered by MEDLINE.
    OBJECTIVE: Evaluate the consistency of MeSH assignment using a test-retest analysis of articles published multiple times.
    METHODS: Three sets of articles that had been published multiple times were selected: Vancouver Group articles, CONSORT Statement articles, and Granada Statement articles. The articles publishing these position papers were searched in PubMed in February 2025, and their records were exported in XML format. The articles' metadata, the assigned MeSH terms, and the indexing methods were extracted. Consistency was assessed using Fleiss' kappa for inter-rater agreement and Krippendorff's alpha for classification reliability, considering each article as a different rater.
    RESULTS: A total of 6, 8, and 5 MEDLINE-indexed articles publishing the Vancouver, CONSORT, and Granada statements were retrieved, with 14, 6, and 10 different MeSH terms assigned, respectively. The first two sets of articles were manually indexed, while the Granada articles were automatically indexed. Fleiss' kappa values for the MeSH terms assigned to the Vancouver, CONSORT, and Granada articles were -0.390, -0.370, and -0.333, respectively, and Krippendorff's alphas were 0.178, 0.525, and 0.183, respectively. "Periodicals as Topic" and "Randomized Controlled Trials as Topic" were used in all Vancouver and all CONSORT articles, respectively. Except for "Humans," no other MeSH term appeared in all Granada articles; the most prevalent terms there were "Pharmacy," "Pharmacies," and "Pharmacy Research." Geographic MeSH terms were assigned to the Vancouver and Granada articles.
    CONCLUSION: A highly inconsistent MeSH indexing pattern was found across the three sets of articles. Automated indexing of the Granada Statements articles did not improve the results.
    Keywords:  Abstracting and indexing; Manuscripts as topic; Medical subject headings; Periodicals as topic; Pharmaceutical services; Pharmacists
    DOI:  https://doi.org/10.1016/j.sapharm.2025.05.008
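    The study treats each republication of a statement as a rater and asks whether the same MeSH terms were assigned. A minimal sketch of that test-retest computation follows, using the Fleiss' kappa implementation from statsmodels; the term-by-article matrix is a toy example, not the study's data.
```python
# Minimal sketch of the test-retest idea: each republished article acts as a "rater",
# each candidate MeSH term is a "subject", and agreement is whether the term was
# assigned (1) or not (0). The toy matrix below is invented for illustration only.
import numpy as np
from statsmodels.stats.inter_rater import fleiss_kappa

# rows = candidate MeSH terms, columns = republished articles; 1 = term assigned
assignments = np.array([
    [1, 1, 1, 1, 1],   # e.g. a term like "Periodicals as Topic" assigned everywhere
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
])

# fleiss_kappa expects, per subject, the count of raters choosing each category
counts = np.column_stack([
    assignments.sum(axis=1),                        # articles that assigned the term
    assignments.shape[1] - assignments.sum(axis=1)  # articles that did not
])
print("Fleiss' kappa:", round(fleiss_kappa(counts), 3))
```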
  11. JMIR Cancer. 2025 May 21. 11: e72522
     Background: Generative artificial intelligence (AI) chatbots may be useful tools for supporting shared prostate cancer (PrCA) screening decisions, but the information produced by these tools sometimes lacks quality or credibility. "Prostate Cancer Info" is a custom GPT chatbot developed to provide plain-language PrCA information drawn only from the websites of key authorities on cancer and from peer-reviewed literature.
    Objective: The objective of this paper was to evaluate the accuracy, completeness, and readability of Prostate Cancer Info's responses to frequently asked PrCA screening questions.
    Methods: A total of 23 frequently asked PrCA questions were individually input into Prostate Cancer Info. Responses were recorded in Microsoft Word and reviewed by 2 raters for accuracy and completeness. Readability of the content was determined by pasting responses into a web-based Flesch-Kincaid Reading Ease score calculator.
    Results: Responses to all questions were accurate and culturally appropriate. In total, 17 of the 23 questions (74%) had complete responses. The average readability of responses was 64.5 (SD 8.7; written at an 8th-grade level).
    Conclusions: Generative AI chatbots, such as Prostate Cancer Info, are great starting places for learning about PrCA screening and preparing men to engage in shared decision-making but should not be used as independent sources of PrCA information because key information may be omitted. Men are encouraged to use these tools to complement information received from a health care provider.
    Keywords:  artificial intelligence; cancer screening; chatGPT; chatbot; generative artificial intelligence; prostate cancer; shared decision making
    DOI:  https://doi.org/10.2196/72522
  12. HSS J. 2025 May 20. 15563316251340697
     Background: The proliferation of artificial intelligence has led to widespread patient use of large language models (LLMs).
    Purpose: We sought to characterize LLM responses to questions about piriformis syndrome (PS).
    Methods: On August 15, 2024, we asked 3 LLMs (ChatGPT-4, Copilot, and Gemini) to respond to the 25 most frequently asked questions about PS, as tracked by Google Trends. We evaluated the accuracy and completeness of the responses on Likert scales, used the Ensuring Quality Information for Patients (EQIP) tool to assess the quality of the responses, and assessed readability using Flesch-Kincaid Reading Ease (FKRE) and Flesch-Kincaid Grade Level (FKGL) scores.
    Results: The mean completeness scores of the responses from ChatGPT, Copilot, and Gemini were 2.8 ± 0.3, 2.2 ± 0.6, and 2.6 ± 0.4, respectively. There was a significant difference in mean completeness scores among the LLMs; in pairwise comparisons, ChatGPT and Gemini were superior to Copilot. There was no significant difference between the LLMs in mean accuracy scores. In the readability analyses, no significant difference was found in FKRE scores, but a significant difference was found in FKGL scores. A significant difference between LLMs was also identified in the quality analysis based on EQIP scores.
    Conclusion: Although the use of LLMs in healthcare is promising, our findings suggest that these technologies need to be improved to perform better in terms of accuracy, completeness, quality, and readability on PS for a general audience.
    Keywords:  artificial intelligence; large language models; piriformis syndrome
    DOI:  https://doi.org/10.1177/15563316251340697
  13. Breast Cancer. 2025 May 21.
       BACKGROUND: The internet is a primary source of health information for breast cancer patients, but online content quality varies widely. This study aimed to evaluate the capability of large language models (LLMs), including ChatGPT and Claude, to assess the quality of online Japanese breast cancer treatment information by calculating and comparing their DISCERN scores with those of expert raters.
    METHODS: We analyzed 60 Japanese web pages on breast cancer treatments (surgery, chemotherapy, immunotherapy) using the DISCERN instrument. Each page was evaluated by the LLMs ChatGPT and Claude, along with two expert raters. We assessed LLMs evaluation consistency, correlations between LLMs and expert assessments, and relationships between DISCERN scores, Google search rankings, and content length.
    RESULTS: Evaluations by LLMs showed high consistency and moderate to strong correlations with expert assessments (ChatGPT vs Expert: r = 0.65; Claude vs Expert: r = 0.68). LLMs assigned slightly higher scores than expert raters. Chemotherapy pages received the highest quality scores, followed by surgery and immunotherapy. We found a weak negative correlation between Google search ranking and DISCERN scores, and a moderate positive correlation (r = 0.45) between content length and quality ratings.
    CONCLUSIONS: This study demonstrates the potential of LLM-assisted evaluation in assessing online health information quality, while highlighting the importance of human expertise. LLMs could efficiently process large volumes of health information but should complement human insight for comprehensive assessments. These findings have implications for improving the accessibility and reliability of breast cancer treatment information.
    Keywords:  Artificial intelligence; Breast cancer; DISCERN instrument; Large language models; Online health information
    DOI:  https://doi.org/10.1007/s12282-025-01719-1
  14. J Exp Orthop. 2025 Apr;12(2): e70281
     Purpose: The purpose of this study was to analyze the quality, accuracy, reliability, and readability of information provided by an artificial intelligence (AI) model, ChatGPT (OpenAI, San Francisco), regarding distal biceps tendon repair surgery.
    Methods: ChatGPT 3.5 was used to answer 27 questions commonly asked by patients regarding 'distal biceps repair surgery'. These questions were categorized using the Rothwell criteria into Fact, Policy, and Value. The answers generated by ChatGPT were analyzed using the DISCERN scale, the Journal of the American Medical Association (JAMA) benchmark criteria, the Flesch-Kincaid Reading Ease Score (FRES), and the Flesch-Kincaid Grade Level (FKGL).
    Results: The DISCERN score was 59 for Fact-based questions, 61 for Policy, and 59 for Value (all considered 'good' scores). The JAMA benchmark score was 0, the lowest possible, for all three categories of Fact, Policy, and Value. The FRES was 24.49 for Fact questions, 22.82 for Policy, and 21.77 for Value; the FKGL was 14.96 for Fact, 14.78 for Policy, and 15.00 for Value.
    Conclusion: The answers provided by ChatGPT were a 'good' source in terms of quality assessment, compared to other online resources that do not have citations as an option. The accuracy and reliability of these answers were shown to be low, with nearly a college-graduate level of readability. This indicates that physicians should caution patients when searching ChatGPT for information regarding distal biceps repairs. ChatGPT serves as a promising source for patients to learn about their procedure, although its reliability and readability are disadvantages for the average patient when utilizing the software.
    Keywords:  artificial intelligence; distal biceps repair; patient education; technology in orthopaedic procedures
    DOI:  https://doi.org/10.1002/jeo2.70281
  15. BMC Oral Health. 2025 May 17. 25(1): 736
     BACKGROUND: This study focused on two artificial intelligence chatbots, ChatGPT 3.5 and Google Gemini, as the primary tools for answering questions related to traumatic dental injuries. The aim of this study was to evaluate the reliability, understandability, and applicability of the responses provided by these chatbots to questions commonly asked by parents of children with dental trauma.
    METHODS: The case scenarios were developed based on questions that parents frequently ask their dentists or artificial intelligence chatbots regarding dental trauma in children. The quality and accuracy of the information obtained from the chatbots were assessed using the DISCERN instrument. The understandability and actionability of the responses were assessed using the Patient Education Materials Assessment Tool for Printed Materials. In the statistical analysis, categorical variables were summarized as frequencies and percentages; for numerical variables, skewness and kurtosis values were calculated to assess normality of distribution.
    RESULTS: Both chatbots performed similarly, although Google Gemini provided higher-quality and more reliable responses. Based on mean scores, ChatGPT 3.5 had higher understandability. Both chatbots demonstrated similar levels of actionability.
    CONCLUSION: Artificial intelligence applications can serve as a helpful starting point for parents seeking information and reassurance after dental trauma. However, they should not replace professional dental consultations, as their reliability is not absolute. Parents should use these applications as complementary resources and seek timely professional advice for accurate diagnosis and treatment.
    Keywords:  Artificial Intelligence; Chat GPT; Discern Instrument; Patient Education Materials Assessment Tool for Printed Materials
    DOI:  https://doi.org/10.1186/s12903-025-06105-z
  16. Cureus. 2025 Apr;17(4): e82705
       INTRODUCTION: Varicella, hand, foot, and mouth disease (HFMD), and measles are some of the common causes of fever with rash in the pediatric age group. ChatGPT and Gemini are effective large language models (LLMs) for parents to understand their child's condition. Therefore, considering the growing popularity of artificial intelligence (AI), LLMs, and their ability to disseminate health information, assessing ChatGPT's (OpenAI, San Francisco, CA, USA) and Gemini's (Google LLC, Mountain View, CA, USA) quality and accuracy is essential.
    MATERIALS AND METHODS: A cross-sectional study was conducted on responses generated using AI for common causes of fever with rash in the pediatric age group, namely varicella, HFMD, and measles. ChatGPT and Gemini were used for the generation of brochures for patient education. The responses generated were evaluated using the Flesch-Kincaid Calculator (Good Calculators: https://goodcalculators.com/), the QuillBot plagiarism tool (QuillBot, Chicago, IL, USA), and the modified DISCERN score. Statistical analysis was done using R version 4.3.2 (R Foundation for Statistical Computing, Vienna, Austria, https://www.R-project.org/), and unpaired t-tests were used to compare the various scores. A p-value of less than 0.05 was considered statistically significant.
    RESULTS: ChatGPT generated a higher word count than Gemini (p=0.047). Differences between the two AI tools in the number of sentences, average words per sentence, average syllables per word, ease score, and grade level were statistically insignificant (p>0.05). The mean reliability score was 3/5 for Gemini versus 2.67/5 for ChatGPT, but the difference was statistically insignificant (p=0.725).
    CONCLUSIONS: This study highlights that ChatGPT generates a higher word count than Gemini (p=0.047). Additionally, there is no significant difference in average ease scores or grade levels for common pediatric exanthematous conditions: varicella, HFMD, and measles. Future research should focus on improving AI-generated health content by incorporating real-time validation mechanisms, expert reviews, and structured patient feedback.
    Keywords:  ai tools; artificial intelligence (ai); chatgpt; google gemini; patient education; pediatric; skin conditions
    DOI:  https://doi.org/10.7759/cureus.82705
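    For reference, the widely published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas behind calculators such as the one used above are:
\[ \text{FRE} = 206.835 - 1.015\,\frac{\text{total words}}{\text{total sentences}} - 84.6\,\frac{\text{total syllables}}{\text{total words}} \]
\[ \text{FKGL} = 0.39\,\frac{\text{total words}}{\text{total sentences}} + 11.8\,\frac{\text{total syllables}}{\text{total words}} - 15.59 \]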
  17. Database (Oxford). 2025 Feb 05. pii: baaf006. [Epub ahead of print]2025
      Curation of literature in life sciences is a growing challenge. The continued increase in the rate of publication, coupled with the relatively fixed number of curators worldwide, presents a major challenge to developers of biomedical knowledgebases. Very few knowledgebases have resources to scale to the whole relevant literature and all have to prioritize their efforts. In this work, we take a first step to alleviating the lack of curator time in RNA science by generating summaries of literature for noncoding RNAs using large language models (LLMs). We demonstrate that high-quality, factually accurate summaries with accurate references can be automatically generated from the literature using a commercial LLM and a chain of prompts and checks. Manual assessment was carried out for a subset of summaries, with the majority being rated extremely high quality. We apply our tool to a selection of >4600 ncRNAs and make the generated summaries available via the RNAcentral resource. We conclude that automated literature summarization is feasible with the current generation of LLMs, provided that careful prompting and automated checking are applied. Database URL: https://rnacentral.org/.
    DOI:  https://doi.org/10.1093/database/baaf006
  18. Urogynecology (Phila). 2025 Jun 01. 31(6): 612-618
       IMPORTANCE: No study has evaluated the health information available on the internet regarding obstetric anal sphincter injury.
    OBJECTIVES: The aim of this study was to assess the quality and accessibility of information on the internet for patients regarding obstetric anal sphincter injury.
    STUDY DESIGN: This cross-sectional study analyzed online obstetric anal sphincter injury health information through a Google search, collecting the top 20 websites for 9 medical and lay terms. Quality was evaluated using the DISCERN score and the Journal of the American Medical Association benchmark criteria. Reading level was determined using the Flesch-Kincaid readability test. Mean DISCERN scores were compared using the Kruskal-Wallis test.
    RESULTS: One hundred eleven unique websites were identified; 46.8% (n = 52) were directed toward medical professionals, and 9% (n = 10) were for law firms or e-commerce sites. Of the patient-facing websites, 24.3% (n = 27) were from health organizations outside of the United States. The DISCERN scores ranged from 16 to 77. Only 18% of websites met all 4 benchmark criteria; 10.8% (n = 12) of websites were inaccessible without subscriptions to journals or databases.
    CONCLUSIONS: Obstetric anal sphincter injury health information online is of generally high quality but is directed primarily at medical professionals. Terms like "anal sphincter injury" required a 15th-grade reading level. While some terms yielded more patient-facing information, their reading levels remained above those recommended for patients. This study highlights the paucity of broadly accessible, patient-facing obstetric anal sphincter injury resources on the internet, owing to variable quality, the inability to assess credibility, and barriers of both physical access and readability.
    DOI:  https://doi.org/10.1097/SPV.0000000000001513
  19. J Robot Surg. 2025 May 22. 19(1): 228
      Robotic approaches have gained popularity in recent years, with multiple studies showing improved short- and long-term outcomes with this technique for esophagectomy. Educational resources should be assessed to ensure patients are knowledgeable about the treatment modalities that are available. Our aim is to evaluate whether online content is a reliable source of patient educational material for robotic esophagectomy. A YouTube query was performed for: "Robot Assisted Minimally Invasive Esophagectomy." The first 60 videos were evaluated by two independent reviewers and scored using the DISCERN tool. Of the 60 videos reviewed, 48 (80%) were included. The average DISCERN score for the videos was 1.3 ± 0.57 (SD), with a score > 3 being good for patient education and ≤ 3 being poor. The content available on YouTube for education about robotic esophagectomy is better suited for surgical education. This underscores a significant opportunity to improve patient education resources for the betterment of shared decision making.
    Keywords:  Educational resources; Esophagectomy; Patient education; Robotic; YouTube
    DOI:  https://doi.org/10.1007/s11701-025-02297-2
  20. Eur Spine J. 2025 May 23.
     PURPOSE: Patients frequently use internet-based resources to seek information. Cervical disc replacement (CDR) is extensively marketed on the internet, and patients may research this option in preference to a fusion. Previous literature has recommended that the readability of patient education materials (PEM) should not exceed the 6th-grade (11-12 years old) United States reading level to optimize health literacy. This study aims to evaluate the readability of online patient education materials concerning cervical disc replacement.
    METHODS: A Google search query was performed on March 1st, 2025 using the term "Cervical Disc Replacement patient information." The first 25 websites meeting study inclusion criteria were analyzed for readability using the Flesch-Kincaid, average reading level consensus, Gunning Fog, Coleman-Liau, SMOG, and Linsear Write indices. Descriptive statistics were reported.
    RESULTS: The mean of the average reading levels by Flesch-Kincaid was 11.5 (1.29). The mean Flesch-Kincaid Reading Ease score was 48.8 (8.12). The mean Gunning Fog score was 12.9 (1.39), Flesch-Kincaid grade level 10.6 (1.69), Coleman-Liau 12.4 (1.48), SMOG 10.5 (1.3), Automated Readability Index 11.5 (2.12), and Linsear Write 69.4 (6.62). No PEMs were at or below the recommended sixth-grade (11-12 years old) United States reading level. Four of the PEMs were considered general health information (GHI); twenty-one were considered clinical practice (CP). No differences were found between CP and GHI websites (P > 0.05).
    CONCLUSION: Creating appropriate patient education materials is integral to achieving optimal health literacy. The current readability of the most accessible PEMs related to CDR is inadequate. As it stands, many patients may not appropriately comprehend the description of their anticipated surgery.
    Keywords:  Cervical disc replacement; Online health information; Patient education material; Readability
    DOI:  https://doi.org/10.1007/s00586-025-08942-6
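    Several of the indices reported above can be reproduced programmatically. A minimal sketch using the third-party textstat package follows; the sample sentence is invented, and exact scores will differ slightly from web calculators, which vary in how they count syllables.
```python
# Minimal sketch: scoring a passage with the same readability indices reported above,
# using the third-party `textstat` package (pip install textstat). Sample text is invented.
import textstat

text = (
    "Cervical disc replacement is a surgery that removes a damaged disc in your neck "
    "and replaces it with an artificial one to keep the joint moving."
)

scores = {
    "Flesch Reading Ease": textstat.flesch_reading_ease(text),
    "Flesch-Kincaid Grade": textstat.flesch_kincaid_grade(text),
    "Gunning Fog": textstat.gunning_fog(text),
    "Coleman-Liau": textstat.coleman_liau_index(text),
    "SMOG": textstat.smog_index(text),
    "Automated Readability Index": textstat.automated_readability_index(text),
    "Linsear Write": textstat.linsear_write_formula(text),
}
for name, value in scores.items():
    print(f"{name}: {value:.1f}")
```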
  21. Medicine (Baltimore). 2025 May 16. 104(20): e42445
      The purpose of this study is to assess the quality, reliability, and usefulness of anesthesia-related YouTube videos for obese patients by analyzing their content and evaluating their usefulness based on the source of the video. This research analyzed the top 108 most-watched YouTube videos tagged with "bariatric anesthesia" and "anesthesia in obese patients." We recorded data such as the upload year, number of views, days since upload, daily average views, length of videos, and the number of likes and comments. Videos were grouped into 3 categories: healthcare institutions, educational institutions, and personal websites. The accuracy and reliability of the videos were evaluated using standards set by the American Medical Association. The quality was assessed via a global quality scale (GQS), and usefulness was determined through a newly developed obesity anesthesia benefit index (OABI). The general usefulness of these videos was rated low. There was a notable correlation between the OABI usefulness scores and both the video durations and the time since upload (P = .000; P = .037). Videos from educational institutions notably scored higher on the GQS and according to the American Medical Association standards compared to other sources (P < .000; P < .000). Despite these differences, 75% of videos from healthcare institutions, 61.9% from educational institutions, and 76.5% from personal sources were classified as minimally useful according to the OABI. Our evaluation using the obesity anesthesia benefit index revealed that the usefulness of the YouTube videos was lower than initially hypothesized.
    Keywords:  anesthesia; education; health literacy; medical; obesity; social media
    DOI:  https://doi.org/10.1097/MD.0000000000042445
  22. Eur J Ophthalmol. 2025 May 23. 11206721251343656
     Purpose: This study aimed to evaluate the educational quality and reliability of YouTube videos on Behçet's uveitis, focusing on videos created by physicians and by other sources.
    Methods: A YouTube search using "Behcet's uveitis" was conducted on October 12, 2024, with the first 50 relevant videos included. All searches were conducted without any user login and after clearing the entire search history, and the first 50 videos were selected based on relevance. Two specialist ophthalmologists independently evaluated the videos using three scoring systems: Journal of the American Medical Association (JAMA), DISCERN, and Health on the Net (HON). Inter-rater reliability was assessed using Cohen's kappa test, yielding strong agreement for JAMA (κ = 0.86), DISCERN (κ = 0.81), and HON (κ = 0.83). Data on video views, upload time, length, likes/dislikes, authorship, and comments were collected. For statistical analysis, means ± SD were used for JAMA, DISCERN, and HON scores. Linear regression assessed relationships between video features and scores, and independent t-tests evaluated the effect of authorship on JAMA, DISCERN, and HON scores.
    Results: The mean number of video views was 10,660, with an average length of 1,387 s and 29.1 comments per video. Consensus JAMA, DISCERN, and HON scores were 1.90 ± 0.68, 46.98 ± 1.16, and 3.59 ± 0.65, respectively, indicating low quality. Videos by physicians scored higher across all metrics. No other factors significantly affected quality.
    Conclusion: YouTube videos on Behçet's uveitis generally lack educational quality and reliability. Patients should prioritize videos created by physicians to access accurate and reliable information.
    Keywords:  Behcet's uveitis; DISCERN; ophthalmology; patient education; youtube videos
    DOI:  https://doi.org/10.1177/11206721251343656
  23. Int Ophthalmol. 2025 May 17. 45(1): 200
       PURPOSE: To evaluate the quality and reliability of YouTube videos involving SMILE (Small Incision Lenticule Extraction) laser surgery and to assess the differences between physician-authored and patient-generated content.
    METHODS: A comprehensive search was conducted on YouTube for SMILE laser surgery videos. The top 100 English-language videos were evaluated using validated scoring systems including DISCERN, JAMA, Global Quality Score (GQS), and Patient Education Materials Assessment Tool for Audio Visual (PEMAT-A/V).
    RESULTS: The average quality scores were moderate across all metrics (DISCERN: 45.1, PEMAT-AV Understandability: 68.8, PEMAT-AV Actionability: 49.9, JAMA: 2.39, GQS: 2.99). Physician-authored videos scored significantly higher on reliability, treatment choices, and overall quality (p < 0.05). Patient-generated videos demonstrated higher viewer engagement, with more views (23,973 vs. 15,467) and comments (28 vs. 8). Positive correlations were determined between video length, likes, comments, and quality metrics (p < 0.05).
    CONCLUSIONS: The SMILE laser surgery information on YouTube is of moderate quality. While physician-authored videos provide higher quality content, patient-generated videos offer unique perspectives and greater engagement. There is a need for balanced, high-quality video content that incorporates both professional expertise and patient experiences to provide comprehensive and reliable online information about SMILE laser surgery.
    Keywords:  Refractive surgery; SMILE laser surgery; Video quality assessment; YouTube
    DOI:  https://doi.org/10.1007/s10792-025-03568-5
  24. Digit Health. 2025 Jan-Dec; 11: 20552076251341090
       Background: Cardiopulmonary exercise testing (CPET) is conducted globally. On TikTok, CPET-related content serves as a key source of information for the public. However, the quality of these videos has not been systematically evaluated. This study aims to assess whether CPET videos on TikTok meet the informational needs of users.
    Methods: A cross-sectional analysis was performed on TikTok videos about CPET in China. Video sources were identified and analyzed. Content evaluation focused on CPET principles, indications, procedures, and indicator interpretation. The reliability and quality of the videos were assessed using four standardized tools: modified DISCERN, Global Quality Scale (GQS), JAMA benchmarks, and Patient Education Materials Assessment Tool for Audiovisual Materials (PEMAT-A/V). Misinformation was summarized, and the relationship between video quality and characteristics was examined.
    Results: Of the video sources, 43.8% were from physicians, 12.5% from nonphysicians, 12.5% from general users, 14.5% from news agencies, 12.5% from nonprofit organizations, and 4.2% from for-profit organizations. Median scores for modified DISCERN, GQS, JAMA, PEMAT-A/V understandability, and actionability were 1.00, 1.00, 2.00, 33.00, and 29.00, respectively. Videos by physicians had significantly higher modified DISCERN and JAMA scores compared to those by nonphysicians (p < 0.01). Likes, comments, collections, and shares positively correlated with quality scores. Common misinformation included exaggerated CPET roles, improper procedures, misinterpretation of indicators, and safety risks.
    Conclusions: The quality and reliability of CPET videos on TikTok are uncertain, with many containing significant misinformation. This problem largely stems from content creators' insufficient understanding of CPET. To address this, implementing standardized training and certification is necessary. Videos produced by physicians generally exhibit higher quality, highlighting the importance of strengthening their leadership in CPET teams. Furthermore, social media platforms should work with CPET providers and video creators to develop a certification system for medical information. These steps could improve video quality, reduce misinformation, and promote accurate CPET knowledge, ultimately benefiting public health.
    Keywords:  Cardiopulmonary exercise testing; TikTok; content quality; misinformation; public health
    DOI:  https://doi.org/10.1177/20552076251341090
  25. Eur J Ophthalmol. 2025 May 21. 11206721251344642
      Purpose: To evaluate the quality of healthcare-related information on laser refractive surgery (LRS), including laser in situ keratomileusis (LASIK) and photorefractive keratectomy (PRK), produced by healthcare professionals (HCP) and non-healthcare professionals (NHCP) on TikTok, using the DISCERN criteria.
    Materials and Methods: Searches of the top 100 results each for "LASIK" and "PRK" were conducted. The resulting 154 LASIK and PRK videos were evaluated for user engagement and content quality using the DISCERN criteria.
    Results: The sources of LRS information were ophthalmologists (39.6%), optometrists (3.2%), and non-healthcare professionals (57.1%). User engagement totaled a combined 9.1 million likes, 79,000 comments, 187,000 shares, and 611,500 saves. DISCERN analysis revealed that videos by HCP had an average summation score of 34.03 (poor quality), with statistically significantly higher scores in 6 of 15 categories, compared to 30.72 (poor quality) for videos by NHCP (p < 0.01). LASIK videos had higher viewership and user engagement than PRK videos. The PRK videos performed better in the additional treatment options category.
    Discussion: The DISCERN criteria are useful for assessing LRS video quality on TikTok. LRS is a popular topic on the platform, with LASIK more popular than PRK. Although HCP scored higher on many DISCERN metrics, most videos from both HCP and NHCP were considered poor quality, and only a minority were considered fair or good quality. HCP should be cognizant of the quality of medical information produced and available to patients on social media platforms.
    Keywords:  LASIK; Laser refractive surgery; PRK; TikTok; healthcare information; social media education
    DOI:  https://doi.org/10.1177/11206721251344642
  26. BMC Public Health. 2025 May 23. 25(1): 1896
     BACKGROUND: Pediatric pneumonia remains a major global health concern and is one of the leading causes of mortality among children under five years of age. With the prevalence of COVID-19, public attention to pediatric pneumonia has significantly increased. In recent years, short video platforms such as Bilibili, TikTok, and Kwai, boasting billions of global users, have emerged as critical channels for disseminating and accessing health-related information. This study systematically evaluates the quality and reliability of pediatric pneumonia-related short videos on these three platforms.
    METHODS: We employed the Chinese keyword "Pneumonia in Children" to conduct searches on Bilibili, TikTok, and Kwai, selected the top 100 recommended related videos of each platform, and extracted and recorded the title, website, publisher, content, duration, days since published, and audience engagement metrics (Likes, Comments, Saves) of each video. The Global Quality Scale (GQS), modified DISCERN (mDISCERN), and Medical Quality Video Evaluation Tool (MQ-VET) were used to evaluate video quality and reliability. Finally, statistical analyses were conducted to compare quality differences among different platforms, different types of publishers, and different video content.
    RESULTS: Significant variations in audience engagement metrics (likes, comments, and saves) were observed across the three platforms (p < 0.01), with TikTok demonstrating the highest values for all metrics. The categorization of video content and publisher types exhibited statistically significant heterogeneity among the platforms (p < 0.001). Videos created by medical professionals had significantly higher quality and reliability scores than content generated by non-medical practitioners (p < 0.001). Bilibili consistently achieved the highest scores across all evaluation tools (GQS, mDISCERN, and MQ-VET; p < 0.001), particularly for content produced by medical professionals. Compared with news and reports, videos focused on disease knowledge and on treatment and prevention received significantly higher scores (p < 0.001). Notably, a negative correlation was identified between video quality scores and audience engagement metrics (p < 0.05).
    CONCLUSION: The overall quality of video content on Bilibili, TikTok, and Kwai is average and reliability is low; Bilibili's video quality and reliability are higher than those of the other two platforms. Videos published by medical professionals also show better quality and higher reliability.
    Keywords:  Bilibili; Cross-sectional study; Kwai; Pediatric pneumonia; Quality and reliability; Short videos; Tiktok
    DOI:  https://doi.org/10.1186/s12889-025-22963-2
  27. Asia Pac J Oncol Nurs. 2025 Dec; 12: 100700
       Objective: This study aimed to identify latent classes of online health information seeking (OHIS) behaviors among young women diagnosed with breast cancer in China and examine associated personal characteristics to support tailored health education strategies.
    Methods: Young women diagnosed with breast cancer were recruited from a cancer center in China between April and September 2024. Participants completed questionnaires on demographic and clinical characteristics, OHIS behaviors, psychosocial and cognitive factors, trust, social norms, communication, and information seeking experience. Latent class analysis (LCA) identified OHIS patterns, and multivariate logistic regression explored associated characteristics.
    Results: Among the 398 patients, the median number of topics sought was 5 (4-7). The most frequently sought topics related to breast cancer included basic knowledge (89.7%), treatment plans (77.6%), and lifestyle (75.4%). Nearly half sought information only a few times a month or less. Social media (82.7%) and official accounts/websites (71.1%) were the most frequently used sources. LCA revealed three OHIS behavior classes: Class 1 "information explorers" (26.4%), Class 2 "occasional seekers" (49.2%), and Class 3 "information experts" (24.4%). Patients in adjuvant or other treatment phases were more likely to belong to Class 2 than Class 1. Those with a longer time since diagnosis were also more likely to be classified into Class 2 or Class 3. Conversely, stage I patients and those who trusted online health information were more likely to belong to Class 1, while higher eHealth literacy was associated with Class 3 membership.
    Conclusions: Young women diagnosed with breast cancer display diverse OHIS patterns influenced by demographic and clinical factors. Recognizing these differences is vital for delivering tailored online health information services.
    Keywords:  Breast cancer; Health behavior; Latent class analysis; Online health information seeking behaviors
    DOI:  https://doi.org/10.1016/j.apjon.2025.100700
  28. Med Ref Serv Q. 2025 May 21. 1-8
      This column reintroduces Really Simple Syndication (RSS). RSS feeds offer a structured way to receive updates from journals, from databases such as PubMed and Ovid, and from websites. Librarians and researchers can use RSS to keep track of trends in their field and maintain awareness of their discipline. This overview covers how to retrieve RSS feeds, how to store them in Outlook and web-based readers, and best practices. Use cases for librarians and challenges are also included.
    Keywords:  Email; Ovid; PubMed; RSS; librarians; online searching; really simple syndication; review
    DOI:  https://doi.org/10.1080/02763869.2025.2499854
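    As a companion to the column, a minimal sketch of polling a feed programmatically follows, using the third-party feedparser package; the feed URL is a placeholder for one generated via PubMed's "Create RSS" option.
```python
# Minimal sketch: polling a journal or PubMed-search RSS feed with the third-party
# `feedparser` package (pip install feedparser). The feed URL below is a placeholder;
# PubMed generates a real one via "Create RSS" under the search bar.
import feedparser

FEED_URL = "https://pubmed.ncbi.nlm.nih.gov/rss/search/EXAMPLE-FEED-ID/?limit=20"

feed = feedparser.parse(FEED_URL)
print(feed.feed.get("title", "(untitled feed)"))
for entry in feed.entries[:10]:
    print("-", entry.get("title", ""), entry.get("link", ""))
```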
  29. Stud Health Technol Inform. 2025 May 15. 327: 622-626
      Developed in 2020, the tala-med search engine provides high-quality, evidence-based health information from trustworthy German websites while ensuring user privacy. However, it still leaves room for technical improvement. In the present work, we replace the Fess crawler with Apache Nutch to improve scalability and customization, as Nutch offers greater flexibility for large-scale web crawling and indexing. The new system uses Docker to integrate five services: PostgreSQL for configurations, Nutch for crawling and indexing, and Elasticsearch for search operations; a Manager orchestrates the process with custom configurations via an extended Nutch REST interface. The configuration options include domain-specific URL filters to select the crawled content. A test crawl demonstrated the system's effectiveness, processing approximately 23k websites over 65 hours. Future work will focus on deploying the crawler long-term and generating a search index for further analysis. We have published our code under the MIT license at https://gitlab.com/mri-tum/aiim/search-platform/crawler.
    Keywords:  Data Systems; Health Information Systems; Information Storage and Retrieval; Online Systems; Search Engine; Software Design; Web-Crawler
    DOI:  https://doi.org/10.3233/SHTI250423
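    The domain-specific URL filtering mentioned above can be pictured as a small allowlist check. The Python sketch below only illustrates the idea and is independent of Nutch's own urlfilter plugins; the domains listed are examples, not tala-med's actual configuration.
```python
# Minimal sketch of the domain-allowlist idea behind the crawler's URL filters.
# Independent of Nutch's filter plugins; the domains listed are illustrative only.
from urllib.parse import urlparse

ALLOWED_DOMAINS = {"www.gesundheitsinformation.de", "www.krebsinformationsdienst.de"}

def should_crawl(url: str) -> bool:
    """Keep only http(s) URLs whose host is on the allowlist."""
    parsed = urlparse(url)
    return parsed.scheme in {"http", "https"} and parsed.netloc.lower() in ALLOWED_DOMAINS

urls = [
    "https://www.gesundheitsinformation.de/diabetes.html",
    "https://example.com/ad-page",
]
print([u for u in urls if should_crawl(u)])
```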