bims-librar Biomed News
on Biomedical librarianship
Issue of 2025-07-06
34 papers selected by
Thomas Krichel, Open Library Society



  1. Health Info Libr J. 2025 Jul 01.
      The 'International Book Collection' was established to foster cultural diversity among the staff community at University Hospitals Coventry and Warwickshire NHS Trust. The initiative focused on building a collection of world literature that reflects the traditional values, beliefs and cultural perspectives of the individuals working in the organisation. The key challenge encountered during the development of the collection was ensuring true representation of the staff community within the organisation. To begin building the collection, promotional strategies were designed around users' needs. In August 2024, the collection was launched as an initiative through which users can recommend books that represent their cultural heritage. It has remained active since launch and, as of June 2025, a total of 32 recommendations have been received.
    Keywords:  collection development; health disparities; health libraries; learning; library services; marketing and publicity
    DOI:  https://doi.org/10.1111/hir.12585
  2. medRxiv. 2025 Jun 10. pii: 2025.06.09.25329285. [Epub ahead of print]
      We have developed a free, public web-based tool, Trials to Publications, https://arrowsmith.psych.uic.edu/cgi-bin/arrowsmith_uic/TrialPubLinking/trial_pub_link_start.cgi, which employs a machine learning model to predict which publications are likely to present clinical outcome results from a given registered trial in ClinicalTrials.gov. The tool has reasonably high precision, yet in a recent study we found that when registry mentions are not explicitly listed in metadata, textual clues (in title, abstract or other metadata) could identify only roughly one-third to one-half of the publications with high confidence. This finding has led us to expand the scope of the tool to search for explicit mentions of registry numbers located within the full text of publications. We have now retrieved ClinicalTrials.gov registry number mentions (NCT numbers) from the full text of three online biomedical article collections (open access PubMed Central, EuroPMC, and OpenAlex), as well as retrieving biomedical citations that are mentioned within the ClinicalTrials.gov registry itself. These methods greatly increase the recall of identifying linked publications and should assist those carrying out evidence syntheses as well as those studying the meta-science of clinical trials.
    Highlights: Those conducting systematic reviews, other evidence syntheses, and meta-science analyses often need to examine published evidence arising from clinical trials. Finding publications linked to a given trial is a difficult manual process, but several automated tools have been developed. The Trials to Publications tool is the only free, public, currently maintained web-based tool that predicts publications linked to a given trial in ClinicalTrials.gov. A recent analysis indicated that the Trials to Publications tool has good precision but limited recall. In the present paper, we greatly enhanced the recall by identifying registry mentions in the full text of articles indexed in open access PubMed Central, EuroPMC, and OpenAlex. The tool now has reasonably comprehensive coverage of registry mentions, both for identifying articles that present trial outcome results and for other types of articles that are linked to, or that discuss, the trials. This should substantially reduce the effort required for literature searches.
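    Since ClinicalTrials.gov identifiers follow the fixed pattern "NCT" plus eight digits, the full-text extraction step described above can be illustrated with a simple pattern match. The Python sketch below shows the general technique only; it is not the authors' pipeline, and the article text is invented.

        import re

        # ClinicalTrials.gov registry numbers are "NCT" followed by eight digits.
        NCT_PATTERN = re.compile(r"\bNCT\d{8}\b")

        def extract_nct_ids(full_text: str) -> set[str]:
            """Return the unique NCT numbers mentioned in a document's full text."""
            return set(NCT_PATTERN.findall(full_text))

        # Hypothetical article text, for illustration only:
        article = "Results of the trial (NCT01234567) extend earlier findings (NCT07654321)."
        print(sorted(extract_nct_ids(article)))  # ['NCT01234567', 'NCT07654321']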
    DOI:  https://doi.org/10.1101/2025.06.09.25329285
  3. Proc Natl Acad Sci U S A. 2025 Jul 08. 122(27): e2503051122
      This study compares the bibliographic and full-text coverage of 15 conventional and alternative discovery/access mechanisms: two multidisciplinary library databases (Scopus and the Web of Science Core Collection), five single-subject databases, the integrated library search (ILS) mechanism of Manhattan University, a scholarly search engine (Google Scholar), two web-based scholarly databases (Dimensions and OpenAlex), two academic social networks (ASNs; Academia.edu and ResearchGate), and two pirate sites (Anna's Archive and Sci-Hub). The analysis is based on known-item searches for 875 target documents in chemistry, materials science, cardiology, public health, economics, education, and psychology. Overall, Google Scholar, OpenAlex, and the ILS are the most comprehensive sources of bibliographic records. Google Scholar's coverage rate is higher than that of all the Manhattan University databases combined, and Scopus, the most comprehensive multidisciplinary library database, has a lower bibliographic coverage rate than Google Scholar, both of the web-based scholarly databases, one of the two ASNs, and one of the two pirate sites. In terms of full-text coverage, the best multidisciplinary options are the ILS, Google Scholar, and the two pirate sites. Although several of the alternative discovery/access mechanisms are deficient in terms of their user interfaces, search capabilities, and metadata, they nonetheless provide excellent bibliographic and full-text coverage of the scholarly literature. In contrast, many single-subject library databases provide very incomplete coverage of their own subject areas. These findings have implications for scholars and students as well as system-wide implications for the use, development, and evaluation of information resources.
    Keywords:  access; discovery; open access; retrieval; scholarly communication
    DOI:  https://doi.org/10.1073/pnas.2503051122
  4. Med Ref Serv Q. 2025 Jul 02. 1-12
      Libraries with systematic review services rely on technology, often selected based on institutional subscriptions, for internal communication and data collection. Many libraries still rely on manual data entry despite the availability of no- or low-code software, such as Microsoft Power Automate® or Zapier, for automating and optimizing team workflows. This case study describes how one library implemented Power Automate® flows to automate email reminders, support project management tasks, coordinate workflows across a large team, collect data, and facilitate assessment and reporting.
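    The logic such flows automate is simple enough to sketch in code. The Python fragment below mirrors a milestone-reminder flow under stated assumptions (hypothetical project records, addresses, and a local mail relay); in Power Automate® the equivalent steps would be configured graphically against a SharePoint list or Excel table rather than written by hand.

        import smtplib
        from datetime import date, timedelta
        from email.message import EmailMessage

        # Hypothetical tracking records; a real flow would read these from a
        # SharePoint list or Excel workbook rather than a hard-coded list.
        projects = [
            {"title": "Statin adherence review", "librarian": "searcher@example.edu",
             "search_due": date.today() + timedelta(days=3)},
        ]

        def send_reminder(project: dict) -> None:
            """Email the assigned librarian when a milestone is approaching."""
            msg = EmailMessage()
            msg["Subject"] = f"Reminder: search update due for '{project['title']}'"
            msg["From"] = "sr-service@example.edu"
            msg["To"] = project["librarian"]
            msg.set_content(f"The search update is due on {project['search_due']}.")
            with smtplib.SMTP("localhost") as smtp:  # assumes a local mail relay
                smtp.send_message(msg)

        for p in projects:
            if (p["search_due"] - date.today()).days <= 7:  # remind a week ahead
                send_reminder(p)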
    Keywords:  Automation; library services; low-code; no-code; project management; statistics; systematic reviews
    DOI:  https://doi.org/10.1080/02763869.2025.2520222
  5. Health Info Libr J. 2025 Jul 03.
      This article is part of a research project aimed at leveraging environmental health literacy (EHL) to enhance public health in developing countries. EHL is an emerging concept that integrates elements from information literacy, health literacy, and environmental literacy. It equips individuals with a wide range of skills and competencies to evaluate and understand the relationship between their environment and their health, enabling them to make informed decisions. Based on a proposed four-dimensional conceptual framework (comprising accessing, understanding, appraising, and applying information), a tool called EHL-Q25 was developed for assessing EHL. This article focuses on how the proposed framework and the validated EHL-Q25 tool can be utilized to inform the provision of services and practices in health sciences libraries.
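    To make the four-dimensional scoring concrete, the sketch below aggregates hypothetical responses to a 25-item instrument by dimension. The item-to-dimension mapping and the 5-point scale are assumptions for illustration only; the published EHL-Q25 mapping may differ.

        from statistics import mean

        # Hypothetical mapping of the 25 items onto the four dimensions;
        # the actual EHL-Q25 item assignment may differ.
        DIMENSIONS = {
            "accessing":     range(0, 6),
            "understanding": range(6, 13),
            "appraising":    range(13, 19),
            "applying":      range(19, 25),
        }

        def dimension_scores(responses: list[int]) -> dict[str, float]:
            """Average the (assumed 1-5 Likert) responses within each dimension."""
            assert len(responses) == 25
            return {dim: mean(responses[i] for i in items)
                    for dim, items in DIMENSIONS.items()}

        print(dimension_scores([4] * 25))  # uniform responses -> 4.0 everywhere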
    Keywords:  Public health; evaluation; health literacy; librarians, health science
    DOI:  https://doi.org/10.1111/hir.12573
  6. JMIR Form Res. 2025 Jun 30.
    BACKGROUND: Chemical ocular injuries are a major public health issue. They involve eye damage from exposure to harmful chemicals and can lead to severe vision loss or blindness if not treated promptly and effectively. Although medical knowledge has advanced, accessing reliable and understandable information on these injuries remains a challenge, owing to unverified online content and complex terminology. Artificial intelligence (AI) tools like ChatGPT provide a promising solution by simplifying medical information and making it more accessible to the general public.
    OBJECTIVE: This study aims to assess the use of ChatGPT in providing reliable, accurate, and accessible medical information on chemical ocular injuries. It evaluates the correctness, thematic accuracy, and coherence of ChatGPT's responses compared to established medical guidelines and explores its potential for patient education.
    METHODS: Nine questions covering various aspects of chemical ocular injuries were entered into ChatGPT: definition, prevalence, etiology, prevention, symptoms, diagnosis, treatment, follow-up, and complications. The responses provided by ChatGPT were compared to the ICD-9 and ICD-10 guidelines for chemical (alkali and acid) injuries of the conjunctiva and cornea, and evaluated for correctness, thematic accuracy, and coherence. The inputs were categorized into three distinct groups, and statistical analyses, including Flesch-Kincaid readability tests, ANOVA, and trend analysis, were conducted to assess their readability, complexity, and trends.
    RESULTS: ChatGPT provided accurate and coherent responses for most questions about chemical ocular injuries, demonstrating thematic relevance. However, the responses sometimes overlooked critical clinical details or guideline-specific elements, such as emphasizing the urgency of care, using precise classification systems, and addressing detailed diagnostic or management protocols. While the answers were generally valid, they occasionally included less relevant or overly generalized information, which reduced their consistency with established medical guidelines. The average Flesch Reading Ease Score (FRES) was 33.84 ± 2.97, indicating a fairly challenging reading level, while the Flesch-Kincaid Grade Level (FKGL) averaged 14.21 ± 0.97, suitable for readers with college-level proficiency. Passive voice was used in 7.22% ± 5.60% of sentences, indicating moderate reliance. One-way ANOVA showed no significant differences in FRES (p = .385), FKGL (p = .555), or passive sentence usage (p = .601) across categories, and trend analysis showed that readability remained relatively constant across the three categories.
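    For reference, the two indices reported here are fixed formulas over word, sentence, and syllable counts. A minimal sketch with illustrative counts (not the study's data):

        def flesch_scores(words: int, sentences: int, syllables: int) -> tuple[float, float]:
            """Standard Flesch Reading Ease (FRES) and Flesch-Kincaid Grade Level (FKGL)."""
            wps = words / sentences   # average words per sentence
            spw = syllables / words   # average syllables per word
            fres = 206.835 - 1.015 * wps - 84.6 * spw
            fkgl = 0.39 * wps + 11.8 * spw - 15.59
            return fres, fkgl

        # A 120-word passage with 5 sentences and 210 syllables:
        fres, fkgl = flesch_scores(120, 5, 210)
        print(f"FRES = {fres:.1f}, FKGL = {fkgl:.1f}")  # about 34.4 and 14.4,
        # i.e. the "fairly difficult" range reported in the results above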
    CONCLUSIONS: ChatGPT shows strong potential for providing accurate and relevant information about chemical ocular injuries. However, its language complexity may limit accessibility for individuals with lower health literacy, and its responses sometimes miss critical aspects. Future improvements should focus on enhancing readability, increasing context-specific accuracy, and tailoring responses to individual needs and literacy levels.
    CLINICALTRIAL: This is not an RCT.
    DOI:  https://doi.org/10.2196/73642
  7. Digit Health. 2025 Jan-Dec;11: 20552076251350760
       Objective: To compare the reliability and readability of responses from Generative Pre-trained Transformer versions 3.5 (GPT-3.5) and 4.0 (GPT-4.0) on traumatic brain injury (TBI) topics against Model Systems Knowledge Translation Center (MSKTC) fact sheets.
    Methods: This study analyzed responses from GPT-3.5 and GPT-4.0 for accuracy, comprehensiveness, and readability against MSKTC fact sheets, incorporating a correlation analysis between reliability and readability scores.
    Results: Findings showed an improvement in reliability from GPT-3.5 (mean score = 3.21) to GPT-4.0 (mean score = 3.63), indicating better accuracy and completeness in the latter. Despite advancements, responses generally remained accurate but not fully comprehensive. Readability comparisons found the MSKTC fact sheets were significantly more reader-friendly than the responses from both artificial intelligence (AI) versions, with no strong correlation between reliability and readability.
    Conclusion: The study highlights progress in AI-generated information on TBI from GPT-3.5 to GPT-4.0 in terms of reliability. However, challenges persist in matching the readability of standard patient education materials, emphasizing the need for future AI developments to focus on enhancing understandability alongside accuracy.
    Keywords:  ChatGPT; Traumatic brain injury; artificial intelligence; readability; reliability
    DOI:  https://doi.org/10.1177/20552076251350760
  8. Sci Rep. 2025 Jul 02. 15(1): 22474
      Optimising healthcare is linked to broadening access to health literacy in Low- and Middle-Income Countries. The safe and responsible deployment of Large Language Models (LLMs) may provide accurate, reliable, and culturally relevant healthcare information. We aimed to assess the quality of outputs generated by LLMs addressing maternal health, employing GPT-4, GPT-3.5, GPT-3.5 custom, and Meditron-70b. Using a mixed-methods, cross-sectional survey approach, specialists from Brazil, the United States, and Pakistan assessed LLM-generated responses in their native languages to a set of three questions relating to maternal health. Evaluators assessed the answers in technical and non-technical scenarios, rating the responses for information quality, clarity, readability, and adequacy. Of the 47 respondents, 85% were female, with a mean age of 50 years and a mean of 19 years of experience (an average volume of 110 assisted pregnancies monthly). Scores attributed to answers by GPT-3.5 and GPT-4 were consistently higher [overall: GPT-3.5, 3.9 (3.8-4.1); GPT-4.0, 3.9 (3.8-4.1); custom GPT-3.5, 2.7 (2.5-2.8); Meditron-70b, 3.5 (3.3-3.6); p < 0.001]. The responses garnered high scores for clarity (Q&A-1 3.5, Q&A-2 3.7, Q&A-3 3.8) and for quality of content (Q&A-1 3.2, Q&A-2 3.2, Q&A-3 3.7); however, they differed by language. The most common limitation to quality was incomplete content. Readability analysis indicated that responses may require a high educational level for comprehension. Gender bias was detected, as models referred to healthcare professionals as male. Overall, GPT-4 and GPT-3.5 outperformed all other models. These findings highlight the potential of artificial intelligence in improving access to high-quality maternal health information. Given the complex process of generating high-quality non-English databases, it is desirable to incorporate more accurate translation tools and resourceful architectures for contextualisation and customisation.
    Keywords:  Evaluation; Large language models; Low- and Middle-Income countries; Maternal health education
    DOI:  https://doi.org/10.1038/s41598-025-03501-x
  9. World Allergy Organ J. 2025 Jul;18(7): 101071
       Background: The increasing use of artificial intelligence (AI) in healthcare, especially in delivering medical information, prompts concerns over the reliability and accuracy of AI-generated responses. This study evaluates the quality, reliability, and readability of ChatGPT-4 responses for chronic urticaria (CU) care, considering the potential implications of inaccurate medical information.
    Objective: The goal of the study was to assess the quality, reliability, and readability of ChatGPT-4 responses to inquiries on CU management in accordance with international guidelines, utilizing validated metrics to evaluate the effectiveness of ChatGPT-4 as a resource for medical information acquisition.
    Methods: Twenty-four questions were derived from the EAACI/GA2LEN/EuroGuiDerm/APAAACI recommendations and utilized as prompts for ChatGPT-4 to obtain responses in individual chats for each question. The inquiries were categorized into 3 groups: (A) Classification and Diagnosis, (B) Assessment and Monitoring, and (C) Treatment and Management Recommendations. The responses were separately evaluated by allergy specialists utilizing the DISCERN instrument for quality assessment, Journal of the American Medical Association (JAMA) benchmark criteria for reliability evaluation, and Flesch scores for readability analysis. The scores were further examined by median calculations and Intraclass Correlation Coefficient assessments.
    Results: Categories A and C exhibited insufficient reliability according to JAMA, with median scores of 1 and 0, respectively. Category B exhibited a low reliability score (median 2, interquartile range 2). The information quality from category C questions was satisfactory (median 51.5, IQR 12.5). All 3 groups exhibited confusing readability levels according to the Flesch assessment.
    Limitations: The study's limitations encompass the emphasis on CU, possible bias in question selection, the use of particular instruments such as DISCERN, JAMA, and Flesch, as well as reliance on expert opinion for assessment.
    Conclusion: ChatGPT-4 demonstrates potential for producing medical content; nonetheless, its reliability is limited, underscoring the necessity for caution and verification when employing AI-generated medical information, especially in the management of CU.
    Keywords:  Artificial intelligence; Chronic urticaria; Generative artificial intelligence
    DOI:  https://doi.org/10.1016/j.waojou.2025.101071
  10. Int J Gynaecol Obstet. 2025 Jul 01.
       OBJECTIVE: To evaluate the responses of large language models (LLMs) to prenatal screening questions for fetal chromosomal anomalies in terms of scientific accuracy, guideline adherence, depth of response, and clarity, as well as their potential roles in patient education and health communication.
    METHODS: Responses generated by ChatGPT-4o and Gemini Advanced 1.5 Pro to frequently asked questions (FAQs) on prenatal screening for fetal chromosomal anomalies were systematically compared. Expert reviewers assessed each reply using a Likert scale across four criteria: adherence to clinical guidelines, scientific accuracy, clarity, and depth of response. Readability scores were calculated with the Flesch Reading Ease (FRE) and Flesch-Kincaid Grade Level (FKGL) formulas.
    RESULTS: LLMs were evaluated based on their responses to prenatal screening questions. ChatGPT-4o received higher proportions of favorable ratings ("good" or "excellent") across all evaluation criteria, including scientific accuracy, guideline adherence, clarity, and depth of response. The difference in average Global Quality Scale scores between ChatGPT and Gemini was statistically significant (3.87 vs. 3.70; P = 0.003). The mean FRE scores were 20.11 for ChatGPT-4o and 32.25 for Gemini Advanced 1.5 Pro, and the mean FKGL scores were 15.09 and 12.64, respectively. The differences in both FRE and FKGL scores were statistically significant (P = 0.001 and P = 0.002, respectively).
    CONCLUSION: LLMs like ChatGPT-4o and Gemini Advanced 1.5 Pro can provide highly accurate responses for prenatal screening for fetal chromosomal anomalies, but they should only be used for informational purposes. A healthcare professional should always be consulted before making any final decisions.
    Keywords:  ChatGPT‐4o; Google Gemini advanced; artificial intelligence in healthcare; patient education; prenatal screening
    DOI:  https://doi.org/10.1002/ijgo.70348
  11. BMC Med Educ. 2025 Jul 01. 25(1): 903
    BACKGROUND: Large language models (LLMs), such as ChatGPT-4o, Grok 3.0, Gemini Advanced 2.0 Pro, and DeepSeek, have been tested in many medical domains in recent years, ranging from clinical decision support systems to patient information processes and even some intraoperative scenarios. However, despite this widespread use, how LLMs perform as step-by-step guides in environments requiring sensory-motor interaction, such as direct cadaver dissection, has not yet been systematically evaluated. This gap is particularly pronounced in anatomically complex areas with low error tolerance, such as brachial plexus dissection. This study aimed to comparatively analyze the performance of four LLMs in terms of the scientific quality, educational value, and readability of their responses to structured questions in a cadaver dissection environment.
    METHODS: A structured question set of 28 items on brachial plexus dissection was created. Four experienced anatomists blindly evaluated the responses from the models using the modified DISCERN scale (mDISCERN) and the Global Quality Score (GQS). Readability was assessed using Flesch Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), Simple Measure of Gobbledygook (SMOG), Gunning Fog Index (GFI), and Coleman-Liau Index (CLI). Content validity was tested via the Content Validity Index (CVI), and inter-rater reliability was calculated using Intraclass Correlation Coefficient (ICC) and Cohen's Kappa.
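    As an illustration of the inter-rater reliability step, Cohen's kappa for a pair of raters can be computed directly with scikit-learn (the ICC computation is analogous). The ratings below are invented for the sketch; they are not the study's data.

        from sklearn.metrics import cohen_kappa_score

        # Hypothetical mDISCERN ratings (1-5) from two raters on the same answers.
        rater_a = [4, 5, 3, 4, 2, 5, 4, 3]
        rater_b = [4, 4, 3, 4, 2, 5, 5, 3]

        kappa = cohen_kappa_score(rater_a, rater_b)
        print(f"Cohen's kappa = {kappa:.2f}")  # 1.0 = perfect agreement, 0 = chance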
    RESULTS: ChatGPT-4o and Grok 3.0 received the highest scores for scientific accuracy and guidance structure (p < 0.01). DeepSeek showed high readability but limited content depth, while Gemini performed moderately across all parameters. Readability metrics were significantly correlated with quality scores.
    CONCLUSION: This is one of the first studies to systematically examine how LLM-based systems perform in a training context with sensory challenges such as cadaver dissection. While LLMs cannot replace the ethical and educational value provided by real human donors, they may offer scalable, individualized support in settings with limited mentorship or cadaver availability. Our study not only aims to support anatomy education in resource-limited environments, but also serves as a foundational reference for future AI-assisted cadaveric studies and intraoperative decision-support models in surgical anatomy.
    Keywords:  Anatomical variation; Artificial intelligence(ai) in anatomy; Brachial plexus; Cadaveric dissection; Clinical decision support; Dissection guidance; Educational AI; Large language models (LLMs); Readability analysis; Surgical anatomy
    DOI:  https://doi.org/10.1186/s12909-025-07493-0
  12. Iowa Orthop J. 2025;45(1): 19-32
       Background: As online medical resources become more accessible, patients increasingly consult AI platforms like ChatGPT for health-related information. Our study assessed the accuracy and appropriateness of ChatGPT's responses to common questions about lateral epicondylitis, comparing them against OrthoInfo as a gold standard.
    Methods: Eight frequently asked questions about lateral epicondylitis from OrthoInfo were selected and presented to ChatGPT at both standard and sixth-grade reading levels. Responses were evaluated for accuracy and appropriateness using a five-point Likert scale, with scores of four or above deemed satisfactory. Evaluations were conducted by two fellowship-trained Shoulder and Elbow surgeons, two Hand surgeons, and one Orthopaedic Sports fellow. We utilized the Flesch-Kincaid test to assess readability, and responses were statistically analyzed using paired t-tests.
    Results: ChatGPT's responses at the sixth-grade level scored lower in accuracy (mean = 3.9 ± 0.87, p = 0.046) and appropriateness (mean = 3.7 ± 0.92, p = 0.045) compared to the standard level (accuracy = 4.7 ± 0.43, appropriateness = 4.7 ± 0.45). When compared with OrthoInfo, standard responses from ChatGPT showed significantly lower accuracy (mean difference = -0.275, p = 0.004) and appropriateness (mean difference = -0.475, p = 0.016). The Flesch-Kincaid grade level was significantly higher in the standard response group (mean = 14.06, p < 0.001) compared to both OrthoInfo (mean = 8.98) and the sixth-grade responses (mean = 8.48). No significant difference was noted between the Flesch-Kincaid grades of OrthoInfo and the sixth-grade responses.
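    A paired design of this kind (the same eight questions rated under two conditions) maps onto a paired t-test; a sketch with invented Likert ratings, not the study's data:

        from scipy.stats import ttest_rel

        # Invented accuracy ratings for the same eight questions at two reading levels.
        standard_level = [5, 5, 4, 5, 4, 5, 5, 4]
        sixth_grade    = [4, 4, 3, 4, 4, 4, 4, 4]

        t, p = ttest_rel(standard_level, sixth_grade)  # paired: same items, two conditions
        print(f"t = {t:.2f}, p = {p:.3f}")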
    Conclusion: At a sixth-grade reading level, ChatGPT provides oversimplified and less accurate information regarding lateral epicondylitis. Although standard-level responses are more accurate, they still do not match the reliability of OrthoInfo, and they exceed the recommended readability level for patient education materials. While ChatGPT cannot be recommended as a sole information source, it may serve as a supplementary resource alongside professional medical consultation. Level of Evidence: IV.
    Keywords:  chatGPT; lateral epicondylitis; patient education; reading level
  13. J Cancer Educ. 2025 Jul 01.
      Effective communication is essential for promoting appropriate skin cancer screening for the public. This study compares the readability of online resources and ChatGPT-generated responses related to the topic of skin cancer screening. We analyzed 60 websites and responses to five questions from ChatGPT-4.0 using five readability metrics: the Flesch Reading Ease, Flesch-Kincaid Grade Level, SMOG Index, Gunning Fog Index, and Coleman-Liau Index. Results showed that both websites and ChatGPT responses exceeded the recommended sixth-grade reading level for health-related information. No significant differences were found in readability between university-hosted and non-university-hosted websites. However, across all readability metrics, ChatGPT responses were significantly more difficult to read. These findings highlight the need to enhance the accessibility of health information by aligning content with recommended literacy levels. Future efforts should focus on developing patient-centered, publicly accessible materials and refining AI-generated content to improve public understanding and encourage proactive engagement in skin cancer screenings.
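    The grade-level indices named here are, like the Flesch pair sketched earlier in this issue, fixed formulas over simple text counts. Under the standard definitions, with illustrative counts for a 500-word sample:

        import math

        def smog(polysyllables: int, sentences: int) -> float:
            """SMOG grade, from words of three or more syllables per 30 sentences."""
            return 1.0430 * math.sqrt(polysyllables * 30 / sentences) + 3.1291

        def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
            """Gunning Fog index: sentence length plus share of complex words."""
            return 0.4 * (words / sentences + 100 * complex_words / words)

        def coleman_liau(letters: int, words: int, sentences: int) -> float:
            """Coleman-Liau index, from letters and sentences per 100 words."""
            return 0.0588 * (100 * letters / words) - 0.296 * (100 * sentences / words) - 15.8

        print(smog(30, 30))                  # ~8.8
        print(gunning_fog(500, 25, 50))      # 12.0
        print(coleman_liau(2300, 500, 25))   # ~9.8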
    Keywords:  Artificial intelligence; ChatGPT; Health literacy; Online health information; Patient education; Readability; Skin cancer screening
    DOI:  https://doi.org/10.1007/s13187-025-02683-2
  14. Cureus. 2025 May;17(5): e85014
      Background: In the modern healthcare era, the internet serves as a major source of information for patients. However, prior studies have shown that online medical information frequently exceeds the recommended readability levels, limiting patient understanding. In the US, the average reading level is between seventh and eighth grade, while leading health organisations recommend that patient information not exceed a sixth-grade level. This study evaluates the readability and quality of publicly accessible online content related to osteochondral injuries of the knee.
    Methods: A systematic search was conducted on Google (Google, Inc., Mountain View, CA), Bing (Microsoft® Corp., Redmond, WA), and Yahoo (Yahoo, Inc., New York, NY) using the terms "osteochondral defect knee" and "osteochondral injury knee." The top 30 uniform resource locators (URLs) for each search term from each search engine were screened. Readability of the content was assessed using four standardised readability metrics (Gunning Fog Index, Flesch-Kincaid Grade, Flesch Reading Ease, and Simple Measure of Gobbledygook (SMOG) index), while quality was measured against the Journal of the American Medical Association (JAMA) benchmark criteria.
    Results: Forty-six unique webpages were included in the analysis. The mean Flesch-Kincaid Grade Level was 10.1 ± 3.4, the mean Gunning Fog Index grade was 11.5 ± 4.1, the mean Flesch Reading Ease score was 43.8 ± 12.8, and the mean SMOG grade was 8.8 ± 2.9. Only five webpages were at or below a sixth-grade reading level. The mean JAMA score was 1.43 ± 1.46 out of four.
    Conclusion: The majority of online resources related to osteochondral knee injuries are difficult to read and lack key quality indicators. Improving both readability and reliability is essential to support patient comprehension and informed decision-making, and to promote better health literacy.
    Keywords:  health literacy; knee; osteochondral injuries; patient education; readability
    DOI:  https://doi.org/10.7759/cureus.85014
  15. Sci Rep. 2025 Jul 02. 15(1): 23336
      Fungi can cause infections in a number of body parts, including the skin, nails, hair, and vulvovaginal tissues. Topical antifungal drugs are the ones most frequently used to treat these infections. The purpose of this study is to assess the readability of topical antifungal medicine package inserts using the Simple Measure of Gobbledygook (SMOG) and Flesch-Kincaid Grade Level (FKGL) readability formulas. The package inserts of 97 topical antifungal medications from the drugs.com website were examined. The analysis determined that the mean Flesch Reading Ease Score was 60.76 ± 6.08, the overall average FKGL was 7.59 ± 0.91, and the mean SMOG score was 11.45 ± 1.09. The package inserts for the miconazole group showed somewhat lower FKGL and higher Flesch Reading Ease ratings, suggesting that this group is more readable than the others. The majority of the package inserts could be read by students in the seventh grade. Additionally, variations across drug formulations were found: cream and gel were the most readable forms. Topical antifungal medication leaflets thus have a generally moderate literacy level, suitable for students in the seventh grade. To improve patient comprehension, some sections, especially those on possible adverse effects, might need further clarification. Making package inserts more readable could help improve patient outcomes by encouraging adherence to treatment plans.
    Keywords:  Flesch-Kincaid; Package inserts; Readability; SMOG; Topical antifungal drugs
    DOI:  https://doi.org/10.1038/s41598-025-07234-9
  16. BMC Oral Health. 2025 Jul 03. 25(1): 1076
       BACKGROUND: This study evaluated the content, reliability and user engagement of YouTube videos on oral and dental health during pregnancy using a mixed-methods approach. The study focused solely on videos in the Turkish language.
    METHODS: A YouTube search was conducted using predefined keywords. After applying inclusion criteria, 189 videos were analyzed. Content evaluation was based on guidelines from the American Academy of Pediatric Dentistry (AAPD), American Dental Association (ADA), American Academy of Family Physicians (AAFP), American Academy of Pediatrics (AAP) and American College of Obstetricians and Gynecologists (ACOG). Video quality and reliability were assessed using the Global Quality Scale (GQS) and modified DISCERN criteria. User engagement metrics and thematic analysis of comments were analyzed using MAXQDA 24.
    RESULTS: Of the analyzed videos, 69.3% were uploaded by health-related sources, while 30.7% were from independent users. Videos from individual healthcare professionals had significantly higher engagement metrics compared to those from healthcare institutions (p < 0.001). Quality analysis showed that 15.3% of videos were high quality, while 56.1% were classified as low quality. The mean Information Reliability Score was 2.32 ± 1.04. Comment analysis showed that 44.4% of commenters were pregnant women, and the most frequently mentioned theme in qualitative analysis was "fear of harm to the baby" (39% of comments).
    CONCLUSIONS: The findings indicate that most YouTube videos on oral and dental health during pregnancy are of low quality and limited reliability. Increased involvement of healthcare professionals in producing evidence-based video content may enhance the quality of information available to pregnant women.
    Keywords:  Health information; Oral health; Patient education; Pregnancy; Social media analysis; YouTube
    DOI:  https://doi.org/10.1186/s12903-025-06498-x
  17. J Cancer Educ. 2025 Jul 01.
      TikTok is currently one of the most popular video-sharing social media platforms, and an increasing number of users post and search for cervical cancer-related videos there. However, the quality and reliability of these videos have not been thoroughly evaluated. We conducted searches using #cervicalcancer on TikTok, reviewed all included videos, and identified seven themes related to cervical cancer for analysis. We assessed the quality and reliability of the videos using the JAMA score and the Modified DISCERN score, and conducted intergroup comparative analyses of video quality and reliability across the different thematic contents. A total of 100 Chinese-language videos related to cervical cancer were collected. Seven themes were established based on content: cervical cancer patient experiences, cause of disease, symptoms, HPV-related topics, cancer treatment, laboratory and imaging examinations, and prognosis. The median JAMA score for all videos was only 1 (0, 3), and the median Modified DISCERN score was also only 1 (0, 4), indicating unsatisfactory results. When comparing video scores across thematic contents, both scores showed statistically significant differences between groups (p < 0.001). Overall, the quality and reliability of Chinese-language cervical cancer videos on TikTok are low. Healthcare professionals should strive to improve video quality and reliability when publishing content, and patients should maintain a cautious attitude when searching for related content.
    Keywords:  Cervical cancer; Quality; Reliability; Social media; TikTok
    DOI:  https://doi.org/10.1007/s13187-025-02677-0
  18. Int J Impot Res. 2025 Jul 02.
      This study evaluates the accuracy and reliability of YouTube videos on retrograde ejaculation, a condition affecting male fertility and quality of life. A systematic search conducted on December 11, 2024, identified 97 relevant videos from an initial pool of 545. Videos were analyzed using the DISCERN tool and Global Quality Score to assess reliability and educational quality. Reliable videos, comprising 54.6% of the sample, were significantly more often uploaded by universities, professional organizations, or nonprofit physician groups than non-reliable videos (P = 0.006). When comparing different uploader subgroups, videos from universities, professional organizations, or nonprofit physician groups demonstrated significantly higher DISCERN and Global Quality Scores (P = 0.006 and P < 0.001, respectively). In contrast, based on the overall distribution of uploader types among the entire video sample, videos uploaded by individuals or for-profit entities were more frequently classified as non-reliable (P = 0.005). Notably, viewership metrics did not differ significantly between reliable and non-reliable videos (P = 0.552), highlighting the challenge of discerning accurate content based on popularity alone. The findings underscore the importance of promoting high-quality, evidence-based information from credible sources on platforms like YouTube. Addressing misinformation and enhancing the visibility of reliable content are critical to improving patient education and decision-making in managing retrograde ejaculation.
    DOI:  https://doi.org/10.1038/s41443-025-01124-4
  19. J Spinal Cord Med. 2025 Jul 01. 1-5
       OBJECTIVE: The use of social media for researching medical conditions is steadily increasing among both healthcare professionals and the general public. Among the most widely used platforms, YouTube hosts numerous videos concerning stem cell therapy for spinal cord injury. In this study, we aimed to evaluate the quality and reliability of such videos available on YouTube.
    METHODS: In August 2024, a search was conducted on the YouTube platform using the keywords "spinal cord injury, stem cell." A total of 153 videos were evaluated independently by two neurosurgeons using the JAMA benchmark criteria and the Global Quality Score (GQS).
    RESULTS: The publication years of the videos ranged from 2008 to 2024, with a mean year of 2018.15 ± 4.21. The mean JAMA score was 2.32 ± 1.16 for the first evaluator and 2.35 ± 1.24 for the second evaluator. The mean GQS was 2.86 ± 1.12 for the first evaluator and 2.77 ± 1.16 for the second. The average interaction index of the videos was 1.65 ± 1.91 (range: 0-14.4), while the average view rate was 5.19 ± 13.10 (range: 0-94.2). A positive correlation was found between the interaction index and the number of likes, number of views, view rate, and video duration (P < 0.05). However, no correlation was observed between the interaction index and the evaluators' scores (P > 0.05). Similarly, no statistically significant correlation was identified between the view rate and the number of likes or the evaluators' scores (P > 0.05). When comparing the number of views, interaction indices, and view rates, these values were found to be higher in videos produced by non-healthcare professionals.
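    Definitions of these engagement metrics vary somewhat across the YouTube-quality literature; one commonly used pair, consistent with the ranges reported here, is sketched below with invented figures (not the study's data).

        def interaction_index(likes: int, dislikes: int, views: int) -> float:
            """(likes - dislikes) / views x 100; one common definition."""
            return (likes - dislikes) / views * 100

        def view_rate(views: int, days_online: int) -> float:
            """Views per day since upload; definitions vary across studies."""
            return views / days_online

        # A video with 180 likes, 6 dislikes and 12,000 views, online for 400 days:
        print(interaction_index(180, 6, 12_000))  # 1.45
        print(view_rate(12_000, 400))             # 30.0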
    CONCLUSION: Although the increasing use of stem cell therapy in various fields has led to greater interest in related YouTube content among patients and their relatives, the platform appears to be insufficient in providing and disseminating reliable medical information in this context.
    Keywords:  Misinformation; Patient education; Spinal cord injury; Stem cell therapy; YouTube
    DOI:  https://doi.org/10.1080/10790268.2025.2524224
  20. J Hum Nutr Diet. 2025 Aug;38(4): e70088
       PURPOSE: This study aimed to evaluate YouTube videos providing nutritional recommendations for irritable bowel syndrome (IBS) in terms of validity, quality, accuracy and reliability.
    METHODS: In December 2023, we searched for relevant videos on YouTube using three search terms related to IBS in Turkish. Two independent researchers analysed the content of 64 videos that met the inclusion criteria. Reliability and quality were determined using the m-DISCERN criteria, the Global Quality Scale (GQS), the Journal of the American Medical Association (JAMA) system, the Video Information and Quality Index (VIQI), and the IBS Nutrition Scoring System (INSS).
    RESULTS: The majority of these videos (67.2%) were produced by gastroenterologists, 15.6% by dietitians, and 17.2% by other individuals. The numbers of views and likes on videos by other individuals were higher than those on the videos of gastroenterologists (p < 0.05). In the comparison across the three groups, no statistically significant differences were found in the mean INSS (p = 0.287) and JAMA scores (p = 0.783). However, significant differences were observed in the mean interaction index (p = 0.029), Video Power Index (VPI) (p = 0.006), m-DISCERN (p < 0.001), GQS (p = 0.002), and VIQI (p < 0.001) scores. GQS scores demonstrated strong positive correlations with both INSS and VIQI scores (r = 0.6528 and r = 0.6174, respectively), and a moderate correlation with m-DISCERN scores (r = 0.531, p < 0.001).
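    Correlations between instrument scores of this kind are typically computed with Spearman's or Pearson's coefficient (the abstract does not state which was used). A Spearman sketch with invented paired scores for eight videos:

        from scipy.stats import spearmanr

        # Invented paired scores; not the study's data.
        gqs  = [4, 3, 5, 2, 4, 3, 5, 2]
        viqi = [15, 12, 18, 9, 14, 11, 19, 10]

        rho, p = spearmanr(gqs, viqi)
        print(f"Spearman rho = {rho:.2f}, p = {p:.4f}")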
    CONCLUSION: Our study shows that users prioritise popularity over reliability when seeking nutritional information. Health professionals should create engaging content to ensure accurate information stands out online.
    Keywords:  Nutrition Scoring System; YouTube; accuracy; dietary advice; irritable bowel syndrome
    DOI:  https://doi.org/10.1111/jhn.70088
  21. Sci Rep. 2025 Jul 01. 15(1): 21671
      This study aims to evaluate the informational quality of oral cancer-related videos on YouTube and Bilibili. A total of 300 videos that met the inclusion and exclusion criteria were selected for analysis: 150 videos from 111 uploaders on YouTube and 150 videos from 134 uploaders on Bilibili. YouTube videos received a greater number of views and likes, while there was no significant difference in average likes per 30 days or comments between the two platforms. The majority of YouTube uploaders were hospitals/non-profit organizations (66.7%) and for-profit companies (17.1%), while Bilibili uploaders were mainly self-media (55.2%) and doctors (29.1%). YouTube videos covered a broader range of topics than Bilibili videos. Though solo narration was the most prevalent video style on both platforms, YouTube exhibited a higher preference for TV shows/documentaries (31.3%). Video quality was assessed using four tools: while mDISCERN (modified DISCERN) and PEMAT-Actionability (Patient Education Materials Assessment Tool) scores were similar across platforms, YouTube videos scored higher on PEMAT-Understandability, VIQI (Video Information and Quality Index), and GQS (Global Quality Score) than Bilibili videos. Videos produced by health professionals were considered more reliable. Spearman correlation analysis revealed no strong relationships between video quality and audience interaction. In conclusion, YouTube videos exhibited higher audience engagement and video quality, yet improvements are needed on both platforms. To promote high-quality health information, it is essential to encourage the development of more professional content creators and to optimize platform algorithms.
    Keywords:  Information quality; Oral cancer; Patient education; Public education; Public health; Social media
    DOI:  https://doi.org/10.1038/s41598-025-02898-9
  22. Clin Investig Arterioscler. 2025 Jun 30. pii: S0214-9168(25)00072-5. [Epub ahead of print] 500829
       BACKGROUND AND AIMS: The internet is a major source of health information, with platforms like YouTube® and Facebook® widely used by patients to learn about diseases and treatments. However, the reliability of content on PCSK9 inhibitors (PCSK9i), a therapy for dyslipidemia, remains unknown. This study aims to evaluate the general characteristics, user engagement metrics, reliability, comprehensiveness, and quality of English- and Spanish-language PCSK9i videos on YouTube® and Facebook®.
    MATERIALS AND METHODS: Analytical observational study. Paired evaluations were conducted using validated tools: the modified DISCERN (mDISCERN) for reliability, a modified content score for comprehensiveness, and the Global Quality Score (GQS) for quality. Comparisons were made based on platform, source, and language.
    RESULTS: A total of 203 videos were analyzed. YouTube® videos had significantly higher median views, longer duration, and greater engagement than Facebook® videos, while Facebook® contained a higher proportion of non-valuable content (19.6% vs. 3.2%, p<0.001). Most videos targeted patients and healthcare professionals, with professional organizations and independent users as the primary contributors. YouTube® videos were more frequently rated as "good or better" based on mDISCERN, content score, and GQS. Notably, for-profit organizations achieved the highest content scores and GQS values. Inter-rater reliability was excellent across all scoring tools, with kappa coefficients exceeding 0.89.
    CONCLUSIONS: YouTube® videos on PCSK9i had higher engagement and reliability than those on Facebook®. For-profit organizations produced the most reliable and exhaustive videos. However, overall quality remains suboptimal, underscoring the need for greater oversight and effective strategies to ensure the dissemination of accurate, high-quality information.
    Keywords:  Health communication; Information dissemination; Multimedia; PCSK9 inhibitors; Social media
    DOI:  https://doi.org/10.1016/j.arteri.2025.500829
  23. Ear Nose Throat J. 2025 Jun 28. 1455613251353407
       OBJECTIVE: To explore public perceptions of thyroidectomy on TikTok by analyzing post-content, creator type, postoperative concerns, content accuracy, and understandability.
    STUDY DESIGN: Mixed-methods study utilizing qualitative and quantitative analyses.
    SETTING: The TikTok social media platform.
    METHODS: In October 2023, the top 100 public TikTok videos were collected using the search terms "thyroidectomy," "thyroid removal," and "thyroid surgery." Videos were analyzed for engagement metrics (likes, comments, shares, views, length) and scored using the Video Power Index (VPI). Creator type (patient, physician, non-MD/DO healthcare provider, or non-medical), content themes, and tone were categorized. Content accuracy was evaluated based on American Thyroid Association (ATA) guidelines. Patient complaints and postoperative symptoms were noted. Videos offering education or medical advice were assessed for understandability and actionability using the Patient Education Materials Assessment Tool (PEMAT).
    RESULTS: Most videos (63%) were created by patients; 27% by physicians, 8% by non-MD/DO providers, and 2% by non-medical creators. Negative portrayals of thyroidectomy (39%) were exclusively from patient accounts. Common complaints included neck pain (19%), low energy (9%), hormone imbalance (7%), weight gain (7%), dysphagia (7%), and cosmetic concerns (7%). The most common themes were post-op experiences (36%) and medical education (36%). Physician-created content was 100% accurate per ATA guidelines, while non-medical accuracy was 65%. PEMAT scores from MD/DO videos showed 78.69% understandability and 26.61% actionability. Patient videos had the highest VPI (0.93 and 0.79).
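    For readers unfamiliar with PEMAT, its understandability and actionability percentages are simple ratios of points earned to points possible, with not-applicable items excluded. A minimal sketch with invented item ratings:

        def pemat_score(item_ratings: list[int | None]) -> float:
            """PEMAT domain score: agree = 1, disagree = 0, N/A items (None)
            excluded; the score is points earned over points possible, x 100."""
            rated = [r for r in item_ratings if r is not None]
            return 100 * sum(rated) / len(rated)

        # 13 understandability items, two rated not-applicable:
        print(round(pemat_score([1, 1, 0, 1, None, 1, 1, 1, 0, 1, None, 1, 1]), 2))
        # 81.82, i.e. 9 of 11 applicable points earned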
    CONCLUSION: TikTok content on thyroidectomy is largely patient-driven, often reflecting negative postoperative experiences. Physicians should increase social media engagement with accurate and actionable content to improve patient education and address prevalent misconceptions.
    Keywords:  health education; internet/electronic interventions; social media analysis; thyroid surgery; thyroidectomy
    DOI:  https://doi.org/10.1177/01455613251353407
  24. Sci Rep. 2025 Jul 02. 15(1): 22654
      Ankle sprains commonly occur in various indoor and outdoor sports, and if not handled properly they can lead to more serious complications and worsen the injury. With the popularity of the internet, short videos have become an important way for people to obtain ankle-related knowledge, yet there is a lack of research evaluating the quality of educational videos about ankle sprains. On the Douyin and Bilibili platforms, we entered the keyword "ankle sprain" and collected videos sorted by number of likes and views. We first summarized the characteristics of the videos and gave an overall rating; we then conducted a correlation analysis of video features against completeness, GQS, and DISCERN ratings; finally, we grouped the videos by platform and uploader identity and compared quality across groups. In total, this study analyzed 168 educational short videos related to ankle sprains from the two platforms. Overall, these videos had low scores for completeness, quality, and reliability. Correlation analysis between video features and ratings revealed no strong relationship between them. When the videos were divided into two groups by platform, those on Douyin had higher scores. Further analysis dividing the uploaders into five groups found that videos uploaded by healthcare professionals (orthopedic and non-orthopedic physicians) had higher scores. The overall quality of short videos related to ankle sprains needs improvement; short videos on Douyin uploaded by medical professionals (orthopedic and non-orthopedic physicians) deserve more attention.
    Keywords:  Ankle sprains; Bilibili; Douyin; Information quality; Short video
    DOI:  https://doi.org/10.1038/s41598-025-07656-5
  25. BMC Public Health. 2025 Jul 02. 25(1): 2245
    BACKGROUND: Esophageal cancer typically lacks specific early symptoms, leading to late-stage diagnosis and poor prognosis, with an overall low 5-year survival rate. However, early detection and timely intervention can significantly improve the 5-year survival rate, underscoring the importance of prevention, screening, and early intervention. Short video platforms are increasingly utilized for health communication, offering opportunities to disseminate medical knowledge. However, the quality and reliability of health-related content, particularly for diseases like esophageal cancer, remain underexplored.
    OBJECTIVE: This study aimed to systematically evaluate the quality, reliability, completeness, and engagement of esophageal cancer-related videos on three popular short video platforms: Bilibili, TikTok, and Kwai, to identify platform-specific strengths and limitations in disseminating health information.
    METHODS: A total of 311 esophageal cancer-related videos were analyzed. Videos were assessed using four standardized scoring frameworks: the Global Quality Score (GQS) for general quality, DISCERN for reliability, a Completeness Score (CS) for comprehensiveness of information, and an Engagement Score (ES) for the understandability and entertainment value of the video. Video features (source, category, content) and user behavior (likes, shares, comments) were also collected. Cross-platform comparisons were conducted to identify disparities in content quality and user interaction.
    RESULTS: This study analyzed 311 esophageal cancer-related videos on Bilibili, TikTok, and Kwai. Video quality varied significantly across platforms, with Bilibili showing the highest DISCERN (5.46), GQS (2.97), and CS (3.64) scores, while TikTok videos achieved the highest ES (2.88) and engagement metrics (e.g., likes and collections; p < 0.001). Kwai videos had the lowest scores across all measures. Content focused primarily on "symptoms" and "treatment," with Bilibili offering more comprehensive coverage. Correlation analysis revealed a positive association between video quality and engagement on Bilibili but a negative association on TikTok (e.g., GQS and likes, r = -0.251, p = 0.009).
    CONCLUSIONS: The quality of esophageal cancer-related videos across Bilibili, TikTok, and Kwai is suboptimal, with notable quality disparities among the platforms. Users on platforms other than Bilibili show limited ability to identify or prefer higher-quality content. This study underscores the potential of short-video platforms for esophageal cancer public health education, but highlights the need for improvements in content quality, ethical standards, and platform governance to address health equity concerns.
    Keywords:  Bilibili; DISCERN; Esophageal cancer; GQS; Information quality; Short video; Social media; TikTok
    DOI:  https://doi.org/10.1186/s12889-025-23475-9
  26. Perspect Sex Reprod Health. 2025 Jun 30.
       INTRODUCTION: Social media platforms have rapidly become key sources of contraceptive health information, shaping the beliefs and behaviors of individuals of reproductive age. Yet, it has become increasingly difficult to distinguish accurate content from misleading information, potentially leading to higher unintended pregnancy rates. Given the limited insights into the quality and reliability of contraceptive information on TikTok, this cross-sectional study aimed to systematically evaluate popular TikTok content on contraception created by various users to identify and analyze misinformation.
    METHOD: Between August and September 2023, we analyzed 100 videos from the top five hashtags related to contraception methods (#birthcontrol, #contraception, #thepill, #naturalbirthcontrol, and #cycletracking) to assess the characteristics of the health information presented and their quality, using the DISCERN tool.
    RESULTS: The TikTok videos collectively received 4.85 billion views. Only 10% were created by medical professionals. Overall, the content showed poor reliability and quality, indicating a prominent presence of contraceptive health misinformation. Furthermore, there was a concerning trend favoring natural contraceptive methods over hormonal options, often without appropriate risk disclosures, accompanied by a growing distrust in health professionals.
    DISCUSSION: The rise of contraceptive misinformation on social media is re-shaping patient-provider relationships and impacting contraceptive beliefs. TikTok offers an excellent public health opportunity to disseminate accurate contraceptive information accessible to all individuals, regardless of their background or resources. To address the observed distrust in health professionals, it is essential to improve contraceptive care quality and promote shared decision-making, which would likely increase satisfaction with contraceptive choices and mitigate negative narratives online.
    DOI:  https://doi.org/10.1111/psrh.70025
  27. Digit Health. 2025 Jan-Dec;11: 20552076251351383
       Objective: The COVID-19 pandemic has deeply impacted mental health, especially among young people, driven by extended social isolation, routine disruptions, and uncertainties about health and the future. While rising levels of anxiety and depression in this group are well-documented, little is known about their online information-seeking patterns during this prolonged crisis. Exploring these patterns is vital for understanding how individuals navigate mental health challenges and seek support in times of uncertainty.
    Method: This cross-sectional study investigates the online mental health information-seeking behaviors of young people in China during the COVID-19 pandemic. Using content analysis, we examined 1211 questions and 2303 responses from a popular Chinese social Q&A platform, Zhihu. Among these, 691 questions were identified as originating from young people, with the remainder attributed to adults. The analysis focused on the types of information sought, the effectiveness of responses, and the responsibility frameworks conveyed. By comparing the information-seeking behaviors of young people to those of adults, the study aims to uncover the unique needs of younger individuals.
    Findings: First, young people primarily sought information about social adaptation, whereas adults demonstrated greater interest in diagnosis-related queries. Second, while young people's questions received more responses on average, nearly half remained unanswered for over four weeks, reflecting a lack of timely support. Finally, the qualitative nature of responses presented limitations, particularly for youth: they received more responses emphasizing individual responsibility and fewer recovery stories compared to adults, limiting exposure to systemic perspectives and hope-inspiring recovery pathways.
    Conclusions: This study highlights the unique mental health information-seeking behaviors of young people in China and the potential of social Q&A platforms, offering valuable insights to help health professionals and policymakers allocate resources effectively and design targeted interventions to support this demographic during the pandemic.
    Keywords:  Mental health; information-seeking behaviors; social Q&A platforms; targeted interventions; young people
    DOI:  https://doi.org/10.1177/20552076251351383
  28. BMC Pregnancy Childbirth. 2025 Jul 02. 25(1): 697
       INTRODUCTION: The internet and social media have become integral parts of people's lives, with many individuals using them to fulfill their information needs. Notably, around 90% of pregnant women worldwide use the internet to seek pregnancy-related information and often make decisions based on what they read. This study aimed to: 1) determine the prevalence of internet and social media use for pregnancy-related information seeking among Lebanese women; 2) assess their knowledge of basic pregnancy information; and 3) explore their attitudes towards information obtained from media sources.
    METHODS: A multi-centric cross-sectional study was conducted from December 1, 2023, to March 15, 2024, across 25 primary health care centers throughout Lebanon. Pregnant women aged between 18 and 45 years completed a questionnaire primarily through face-to-face meetings and additionally online using Google Forms. The questionnaire was distributed in two ways: first, by approaching pregnant women at healthcare centers; and second, by contacting pregnant women via phone. The questionnaire, written in Arabic, included general questions about sociodemographic variables and social media use, as well as specific questions regarding knowledge and information-seeking behavior related to pregnancy.
    RESULTS: A total of 377 pregnant women participated in the study, 74.3% (280) of whom had previous children. Additionally, 73.5% (277) of the participants used the internet to obtain medical information related to pregnancy, with Google being the most utilized platform. The most commonly searched topics were food and nutritional supplements recommended during pregnancy, drugs and practices to be avoided, and common pregnancy symptoms. Non-mothers were more likely than mothers to follow a medical influencer (p = 0.002) and to use the internet for pregnancy-related information (p = 0.01). In a univariate logistic regression analysis, not having had a previous abortion (p = 0.04, OR = 0.61), not experiencing financial difficulties in visiting a doctor (p = 0.02, OR = 0.61), and using the internet for pregnancy-related information (p < 0.0001, OR = 2.55) were predictors of good knowledge about pregnancy information.
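    The univariate odds ratios reported here come from single-predictor logistic regressions, where the odds ratio is the exponentiated coefficient. The sketch below shows the general computation on simulated data; the predictor effect and sample are invented, not the study's records.

        import numpy as np
        import statsmodels.api as sm

        # Simulated binary data: predictor = used the internet for pregnancy
        # information (0/1); outcome = good knowledge (0/1).
        rng = np.random.default_rng(0)
        internet = rng.integers(0, 2, 300)
        knowledge = (rng.random(300) < np.where(internet == 1, 0.70, 0.48)).astype(int)

        X = sm.add_constant(internet)               # intercept + single predictor
        fit = sm.Logit(knowledge, X).fit(disp=False)
        print(f"OR = {np.exp(fit.params[1]):.2f}")  # exponentiated coefficient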
    CONCLUSION: The internet provides easy access to information for pregnant women. Non-mothers are more likely than mothers to use the internet for medical information. Using the internet helps pregnant women gain knowledge about pregnancy-related information.
    Keywords:  Information; Internet; Lebanon; Online; Pregnancy
    DOI:  https://doi.org/10.1186/s12884-025-07810-x
  29. Health Commun. 2025 Jun 30. 1-16
      Information seeking and avoidance are important coping strategies for individuals dealing with mental and physical health conditions. To explore the mechanisms behind these behaviors, this study applies the Planned Risk Information Seeking Model (PRISM). An online survey of 1,327 individuals with mental or physical health conditions found more similarities than differences in the predictors of both seeking and avoidance behaviors across these groups, supporting the cross-contextual validity of PRISM. Specifically, attitudes toward information behaviors and subjective norms were identified as overarching predictors. By focusing on mental health conditions, which have been studied less frequently, this research shows that mental health information seeking is influenced by one's attitudes, fears, and hopes, while avoidance is more distinctively shaped by attitudes and subjective norms related to avoidance.
    DOI:  https://doi.org/10.1080/10410236.2025.2521010
  30. Cureus. 2025 Jun;17(6): e85277
      Introduction: Artificial intelligence (AI) chatbots, including ChatGPT and DeepSeek, are becoming popular tools for generating patient education materials for chronic diseases. AI chatbots are useful as supplements to traditional counseling but lack the empathy and intuition of healthcare professionals, making them most effective when used alongside human therapists. The objective of this study was to compare ChatGPT-4o- and DeepSeek V3-generated patient educational guides for epilepsy, heart failure, chronic obstructive pulmonary disease (COPD), and chronic kidney disease (CKD).
    Methodology: In this cross-sectional study, standardized prompts for each disease were entered into ChatGPT and DeepSeek. The resulting texts were evaluated for readability, originality, quality, and suitability. Unpaired t-tests were performed to analyze statistical differences between the tools.
    Results: Both AI tools created patient education materials with similar word and sentence counts, readability scores, reliability, and suitability in all areas, except for the similarity percentage, which was much higher in ChatGPT outputs (p = 0.049). The readability scores indicated that both tools produced content above the recommended level for patient materials, and both yielded similarity indices that exceeded accepted academic thresholds. Reliability scores were moderate, and while understandability was high, actionability scores were suboptimal for both models.
    Conclusion: The patient education materials produced by ChatGPT and DeepSeek are similar in nature, but neither satisfies recommended standards for readability, originality, or actionability. Both still need additional fine-tuning and human oversight to enhance accessibility, reliability, and practical utility in clinical settings.
    Keywords:  artificial intelligence in medicine; chatgpt; ckd; copd; deepseek; epilepsy; heart failure; patient education guides
    DOI:  https://doi.org/10.7759/cureus.85277
  31. Comput Struct Biotechnol J. 2025 ;27 2626-2637
      Darling is a web application that employs literature mining to detect disease-related biomedical entity associations. Darling can detect sentence-based co-occurrences of biomedical entities such as genes, proteins, chemicals, functions, tissues, diseases, environments, and phenotypes in biomedical literature found in six disease-centric databases. In this version, we deploy additional query channels focusing on COVID-19, GWAS studies, and cardiovascular, neurodegenerative, and cancer diseases. Compared to its predecessor, users now have extended query options, including searches with PubMed identifiers, disease records, entity names, titles, single nucleotide polymorphisms, or the Entrez syntax. Furthermore, after applying named entity recognition, one can retrieve and mine the relevant literature from the terms recognized in free input text. Term associations are captured in customizable networks, which can be filtered by either term or co-occurrence frequency and visualized in 2D as weighted graphs or in 3D as multi-layered networks. The fetched terms are organized in searchable tables and clustered annotated documents. The reported genes can be further analyzed for functional enrichment using external applications called from within Darling. The Darling databases, including terms and their associations, are updated annually. Darling is available at: https://www.darling-miner.org/.
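    The counting step behind such co-occurrence networks is straightforward to sketch: each sentence contributes one unit of weight to every pair of distinct entities recognized in it. This is a generic illustration with invented entities, not Darling's implementation.

        import itertools
        from collections import Counter

        def cooccurrence_counts(sentences: list[list[str]]) -> Counter:
            """Count sentence-level co-occurrences; each inner list holds the
            entity terms recognized in one sentence."""
            counts: Counter = Counter()
            for terms in sentences:
                for a, b in itertools.combinations(sorted(set(terms)), 2):
                    counts[(a, b)] += 1  # edge weight in the co-occurrence graph
            return counts

        # Entities recognized in three sentences of a hypothetical abstract:
        sents = [["TP53", "apoptosis", "cancer"], ["TP53", "cancer"], ["apoptosis", "hypoxia"]]
        print(cooccurrence_counts(sents))  # e.g. ('TP53', 'cancer') has weight 2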
    Keywords:  Co-occurrence analysis; Literature mining; Named entity recognition; Network analysis; Text mining
    DOI:  https://doi.org/10.1016/j.csbj.2025.06.025