bims-librar Biomed News
on Biomedical librarianship
Issue of 2025-01-26
twenty-six papers selected by
Thomas Krichel, Open Library Society



  1. Med Ref Serv Q. 2025 Jan 20. 1-21
      The weeding project of the George F. Smith Library Reference Collection was undertaken in anticipation of a space reconfiguration. With no place to relocate the reference books, the librarians removed them from the area selected for redesign by discarding material or interfiling it with the circulating collection. Only a small portion of the "last copy monographs" collection was selected for retention and preservation. This case study analyzes a one-time comprehensive project to free up space and dispose of obsolete material, and explains the reasoning behind the decisions to discard, interfile, or preserve the "last institutional copies" of monographs.
    Keywords:  Health sciences; last copy; reference collection; weeding
    DOI:  https://doi.org/10.1080/02763869.2025.2453718
  2. JMIR Hum Factors. 2025 Jan 23. 12: e56941
       BACKGROUND: The internet is a key source of health information, but the quality of content from popular search engines varies, posing challenges for users, especially those with low health or digital health literacy. To address this, the "tala-med" search engine was developed in 2020 to provide access to high-quality, evidence-based content. It prioritizes German health websites based on trustworthiness, recency, user-friendliness, and comprehensibility, offering category-based filters while ensuring privacy by avoiding data collection and advertisements.
    OBJECTIVE: This study aims to evaluate the acceptance and usability of this independent, noncommercial search engine from the users' perspectives and their actual use of the search engine.
    METHODS: For the questionnaire study, a cross-sectional study design was used. In total, 802 participants were recruited through a web-based panel and were asked to interact with the new search engine before completing a web-based questionnaire. Descriptive statistics and multiple regression analyses were used to assess participants' acceptance and usability ratings, as well as predictors of acceptance. Furthermore, from October 2020 to June 2021, we used the open-source web analytics platform Matomo to collect behavior-tracking data from consenting users of the search engine.
    RESULTS: The study indicated positive findings on the acceptance and usability of the search engine, with more than half of the participants willing to reuse (465/802, 58%) and recommend it (507/802, 63.2%). Of the 802 users, 747 (93.1%) valued the absence of advertising. Furthermore, 92.3% (518/561), 93.9% (553/589), 94.7% (567/599), and 96.5% (600/622) of those users who used the filters agreed at least partially that the filter functions were helpful in finding trustworthy, recent, user-friendly, or comprehensible results. Participants criticized some of the search results regarding the selection of domains and shared ideas for potential improvements (eg, for a clearer design). Regression analyses showed that the search engine was especially well accepted among older users, frequent internet users, and those with lower educational levels, indicating an effective targeting of segments of the population with lower health literacy and digital health literacy. Tracking data analysis revealed 1631 sessions, comprising 3090 searches across 1984 unique terms. Users performed 1.64 (SD 1.31) searches per visit on average. They prioritized the search terms "corona," "back pain," and "cough." Filter changes were common, especially for recency and trustworthiness, reflecting the importance that users placed on these criteria.
    CONCLUSIONS: User questionnaires and behavior tracking showed the platform was well received, particularly by older and less educated users, especially for its advertisement-free design and filtering system. While feedback highlighted areas for improvement in design and filter functionality, the search engine's focus on transparency, evidence-based content, and user privacy shows promise in addressing health literacy and navigational needs. Future updates and research will further refine its effectiveness and impact on promoting access to quality health information.
    Keywords:  Germany; digital health literacy; evidence-based content; health information; health literacy; information-seeking behavior; medical information; navigational needs; search engine; user behavior
    DOI:  https://doi.org/10.2196/56941
  3. Sleep Med. 2025 Jan 08. pii: S1389-9457(24)00596-3. [Epub ahead of print] 127: 100-119
      Systematic reviews and meta-analyses are increasingly common in sleep research, although their methodological quality has been a matter of concern. Efforts towards methodological standardization are needed to ensure the reliability of sleep-related systematic reviews. The development of search strategies is a critical step in a systematic review and one that often introduces methodological bias. Standardized search filters have been used to facilitate the development of search strategies; however, such filters have not been developed for sleep medicine. The current study aimed to develop a list of PubMed search filters related to sleep medicine, including specific search strategies for different sleep disorders and sleep conditions. First, a list of sleep disorders and conditions was created for which search filters would be developed. This included most conditions listed in the International Classification of Sleep Disorders - 3rd edition. Additional search filters were developed for proposed disorders not recognized as independent clinical entities, and for other sleep-related conditions. All search strategies were designed specifically for PubMed by combining relevant MeSH terms and free terms (a hypothetical example of such a filter is sketched after this entry). Nine fully independent and unrelated MeSH terms related to sleep were identified. In total, 91 search filters were developed, covering 71 different sleep-related conditions. With the current work, we aimed to provide a list of reliable search filters organized to cover the field broadly, and therefore useful for different types of systematic reviews within sleep medicine, ranging from narrow-focused meta-analyses to broader scoping reviews, mapping reviews, and meta-epidemiological studies.
    Keywords:  Epidemiology; Evidence-based medicine; Meta-epidemiological studies; Meta-research; Search strategies; Search strings
    DOI:  https://doi.org/10.1016/j.sleep.2024.12.032
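    A minimal sketch of what a filter of this kind can look like: a hypothetical insomnia filter combining the relevant MeSH heading with free-text terms, run against PubMed through the NCBI E-utilities (using the requests library). The filter text is illustrative only, not one of the 91 published filters.

        import requests

        # Hypothetical insomnia filter: one MeSH heading plus free-text variants.
        insomnia_filter = (
            '"Sleep Initiation and Maintenance Disorders"[MeSH Terms]'
            ' OR insomnia*[Title/Abstract]'
            ' OR sleeplessness[Title/Abstract]'
        )

        # Query the NCBI E-utilities esearch endpoint for matching PMIDs.
        resp = requests.get(
            "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
            params={"db": "pubmed", "term": insomnia_filter,
                    "retmode": "json", "retmax": 20},
        )
        pmids = resp.json()["esearchresult"]["idlist"]
        print(len(pmids), "sample records retrieved")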
  4. F1000Res. 2023; 12: 1140
       Background: The demand for online education promotion platforms has increased. The digital library system is one of the many systems that support teaching and learning. However, most digital library systems store only books that were developed or purchased exclusively by the library itself, without connecting data with other agencies in the same system.
    Methods: To build a hybrid recommender system model for digital libraries drawing on multiple online publishers, we created a prototype digital library system that connects important knowledge sources from multiple digital libraries and online publishers to index and recommend e-books. The system uses an API-based linking process to connect data sources that are stored separately, such as e-books on education from educational institutions, e-books from government agencies, and e-books from religious organizations. A hybrid recommender system suitable for users was then developed by combining a Collaborative Filtering (CF) model with Content-Based Filtering (CB). The proposed hybrid model takes into account book category, users' reading habits, and the source of information. The experiments were evaluated by soliciting feedback from system users and comparing the results with conventional recommendation methods.
    Results: NDCG and Precision scores were compared for Hybrid Score 50:50, Hybrid Score 20:80, Hybrid Score 80:20, CF-score, and CB-score. The Hybrid Score 80:20 method achieved the highest average NDCG score.
    Conclusions: Using a hybrid recommender system model that combines 80% Collaborative Filtering with 20% Content-Based Filtering improves the recommendation method, leading to better referral efficiency and greater overall efficiency compared to traditional approaches (a minimal sketch of such a weighted blend follows this entry).
    Keywords:  Recommender systems; collaborative filtering; content-based filtering; digital library; hybrid recommender systems; multiple database; user profile
    DOI:  https://doi.org/10.12688/f1000research.133013.3
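    A minimal sketch of the 80:20 weighted blend the conclusion describes, assuming min-max normalized scores; the paper's actual CF and CB models and normalization are not specified in the abstract, so all names here are illustrative.

        def hybrid_scores(cf, cb, w_cf=0.8, w_cb=0.2):
            """Blend per-item collaborative (CF) and content-based (CB) scores 80:20."""
            def normalize(scores):
                lo, hi = min(scores.values()), max(scores.values())
                span = (hi - lo) or 1.0
                return {item: (s - lo) / span for item, s in scores.items()}
            cf_n, cb_n = normalize(cf), normalize(cb)
            return {item: w_cf * cf_n[item] + w_cb * cb_n.get(item, 0.0)
                    for item in cf_n}

        # Toy usage: rank three e-books for one user.
        cf = {"ebook_a": 4.1, "ebook_b": 2.3, "ebook_c": 3.7}  # predicted ratings
        cb = {"ebook_a": 0.2, "ebook_b": 0.9, "ebook_c": 0.5}  # profile similarity
        for item, score in sorted(hybrid_scores(cf, cb).items(), key=lambda kv: -kv[1]):
            print(item, round(score, 3))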
  5. Int Orthop. 2025 Jan 24.
       PURPOSE: This study aimed to assess the presence of spin in abstracts of systematic reviews and meta-analyses comparing biceps tenodesis and tenotomy outcomes and to explore associations between spin and specific study characteristics.
    METHODS: Using Web of Science and PubMed databases, systematic reviews and meta-analyses comparing outcomes of biceps tenodesis and tenotomy were identified. Abstracts were evaluated for the nine most severe types of spin as described by Yavchitz et al. and appraised using the AMSTAR 2 (A MeaSurement Tool to Assess systematic Reviews). Study characteristics were extracted, including adherence to PRISMA guidelines, funding status, and impact metrics such as journal impact factor, total number of citations, and average annual citations.
    RESULTS: A total of 16 studies were included, with spin detected in 81.3% of the abstracts. Type three spin was the most frequent (56.3%), followed by types six (43.8%), five (37.5%), nine (25.0%), two (12.5%), and four (6.3%). Spin types one, seven, and eight were not observed. AMSTAR 2 appraised 75% of the studies as 'low' quality, and 25% as 'critically low' quality. All studies had at least one critical flaw, with item 15 (investigation of publication bias) being the most frequent (93.8%). A strong positive correlation was found between AMSTAR 2 scores and citation counts (r = 0.821, p < 0.001). Studies with a higher number of spin incidents were significantly more likely to have an associated letter to the editor (p = 0.0043).
    CONCLUSION: Severe types of spin were prevalent in the abstracts of systematic reviews and meta-analyses comparing biceps tenodesis and tenotomy. Data analysis suggests that abstracts with a higher incidence of spin tend to attract more scrutiny from the academic community. These findings highlight the need to enhance reporting standards.
    Keywords:  Biceps tenodesis; Biceps tenotomy; LHBT; Meta-analysis; Spin; Systematic review
    DOI:  https://doi.org/10.1007/s00264-025-06414-6
  6. BMC Oral Health. 2025 Jan 17. 25(1): 86
       BACKGROUND: Artificial intelligence (AI) and large language models (LLMs) like ChatGPT have transformed information retrieval, including in healthcare. ChatGPT, trained on diverse datasets, can provide medical advice but faces ethical and accuracy concerns. This study evaluates the accuracy of ChatGPT-3.5's answers to frequently asked questions about oral cancer, a condition where early diagnosis is crucial for improving patient outcomes.
    METHODS: A total of 20 questions, selected from Google Trends and from questions asked by patients in the clinic, were posed to ChatGPT-3.5. The responses provided by ChatGPT were evaluated for accuracy by medical oncologists and oral and maxillofacial radiologists. Inter-rater agreement was assessed using Fleiss' and Cohen's kappa (a worked kappa example follows this entry). The scores given by the specialties were compared with the Mann-Whitney U test. The references provided by ChatGPT-3.5 were evaluated for authenticity.
    RESULTS: Of the 80 responses obtained for the 20 questions, 41 (51.25%) were rated as very good, 37 (46.25%) as good, and 2 (2.50%) as acceptable. There was no significant difference between the oral and maxillofacial radiologists and the medical oncologists across all 20 questions. Of the 81 references provided in ChatGPT-3.5's answers, only 13 were scientific articles, 10 were fake, and the remainder came from websites.
    CONCLUSION: ChatGPT provided reliable information about oral cancer and did not provide incorrect information or suggestions. However, not all information provided by ChatGPT is based on real references.
    Keywords:  Accuracy; Artificial intelligence; ChatGPT; Oral cancer
    DOI:  https://doi.org/10.1186/s12903-025-05479-4
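    For reference, Cohen's kappa corrects the raw agreement rate between two raters for the agreement expected by chance. A self-contained sketch with toy labels (not the study's data):

        from collections import Counter

        def cohen_kappa(r1, r2):
            """Cohen's kappa = (p_o - p_e) / (1 - p_e) for two raters' labels."""
            n = len(r1)
            p_o = sum(a == b for a, b in zip(r1, r2)) / n    # observed agreement
            c1, c2 = Counter(r1), Counter(r2)
            p_e = sum(c1[c] * c2[c] for c in c1) / (n * n)   # chance agreement
            return (p_o - p_e) / (1 - p_e)

        # Toy example: two raters scoring six responses on a quality scale.
        rater1 = ["very good", "good", "good", "acceptable", "very good", "good"]
        rater2 = ["very good", "good", "acceptable", "acceptable", "very good", "very good"]
        print(round(cohen_kappa(rater1, rater2), 3))   # 0.52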
  7. Med Teach. 2025 Jan 20. 1-8
       PURPOSE: Our study aimed to: i) Assess the readability of textbook explanations using established indexes; ii) Compare these with GPT-4's default explanations, ensuring similar word counts for direct comparisons; iii) Evaluate GPT-4's adaptability by simplifying high-complexity explanations; iv) Determine the reliability of GPT-3.5 and GPT-4 in providing accurate answers.
    MATERIAL AND METHODS: We utilized a textbook designed for ABPMR certification. Our analysis covered 50 multiple-choice questions, each with a detailed explanation, focusing on non-traumatic spinal cord injury (NTSCI).
    RESULTS: Our analysis revealed statistically significant differences in readability scores, with the textbook achieving 14.5 (SD = 2.5) compared to GPT-4's 17.3 (SD = 1.9), indicating that GPT-4's explanations are generally more complex (p < 0.001). Using the Flesch Reading Ease Score (the standard formulas are sketched after this entry), 86% of GPT-4's explanations fell into the 'Very difficult' category, significantly more than the textbook's 58% (p = 0.006). GPT-4 successfully demonstrated adaptability by reducing the mean readability score of the nine most complex explanations while maintaining the word count. Regarding reliability, GPT-3.5 and GPT-4 scored 84% and 96% respectively, with GPT-4 outperforming GPT-3.5 (p = 0.046).
    CONCLUSIONS: Our results confirmed GPT-4's potential in medical education by providing highly accurate yet often complex explanations for NTSCI, which were successfully simplified without losing accuracy.
    Keywords:  ChatGPT; Chatbot; readability; reliability
    DOI:  https://doi.org/10.1080/0142159X.2024.2430365
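    The two readability measures referenced here have standard formulas; a compact sketch follows (the vowel-group syllable counter is a rough approximation of what dedicated readability tools do):

        import re

        def count_syllables(word):
            # Crude heuristic: count vowel groups; real tools use dictionaries.
            return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

        def readability(text):
            sentences = max(1, len(re.findall(r"[.!?]+", text)))
            words = re.findall(r"[A-Za-z']+", text)
            wps = len(words) / sentences                          # words per sentence
            spw = sum(map(count_syllables, words)) / len(words)   # syllables per word
            fres = 206.835 - 1.015 * wps - 84.6 * spw   # Flesch Reading Ease
            fkgl = 0.39 * wps + 11.8 * spw - 15.59      # Flesch-Kincaid Grade Level
            return round(fres, 1), round(fkgl, 1)

        print(readability("The spinal cord relays signals. Injury disrupts them."))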
  8. Clin Pract. 2024 Dec 31. pii: 8. [Epub ahead of print] 15(1)
      Introduction: Survival in early breast cancer (BC) has improved significantly thanks to numerous new drugs. Nevertheless, information about the need for systemic therapy, especially chemotherapy, represents an additional stress factor for patients. A common coping strategy is searching for further information, traditionally via search engines or websites, but artificial intelligence (AI) is increasingly being used as well. It is currently unclear who provides the most reliable information. Material and Methods: AI in the form of ChatGPT 3.5 and 4.0, Google, and the website of PINK, a provider of a prescription-based mobile health app for patients with BC, were compared to determine the validity of their statements on the five most common side effects of nineteen approved drugs and one drug with pending approval (ribociclib) for the systemic treatment of BC. For this purpose, the drugs were divided into three groups: chemotherapy, targeted therapy, and endocrine therapy. The reference for the comparison was the prescribing information of the respective drug. A congruence score was calculated for the information on side effects: correct information (2 points), generally appropriate information (1 point), and otherwise no points (a worked example of this scoring follows this entry). The information sources were then compared using a Friedman test and a Bonferroni-corrected post hoc test. Results: In the overall comparison, ChatGPT 3.5 received the best score with a congruence of 67.5%, followed by ChatGPT 4.0 with 67.0%, PINK with 59.5%, and Google with 40.0% (p < 0.001). There were also significant differences when comparing the individual subcategories, with the best congruence achieved by PINK (73.3%, p = 0.059) in the chemotherapy category, ChatGPT 4.0 (77.5%; p < 0.001) in the targeted therapy category, and ChatGPT 3.5 (p = 0.002) in the endocrine therapy category. Conclusions: Artificial intelligence and professional online information websites provide the most reliable information on the possible side effects of the systemic treatment of early breast cancer, but congruence with the prescribing information is limited. Medical consultation should still be considered the best source of information.
    Keywords:  ChatGPT; Google; PINK; artificial intelligence; breast cancer; side effects; systemic therapy
    DOI:  https://doi.org/10.3390/clinpract15010008
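    A hypothetical reading of the congruence percentages, assuming each of the twenty drugs was scored on its five most common side effects at up to 2 points each; the paper's exact denominator is not given in the abstract.

        max_points = 20 * 5 * 2          # 20 drugs x 5 side effects x 2 points = 200
        chatgpt_35 = 0.675 * max_points  # 67.5% congruence -> about 135 points
        google     = 0.400 * max_points  # 40.0% congruence -> about  80 points
        print(chatgpt_35, google)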
  9. World J Gastroenterol. 2025 Jan 21. 31(3): 101092
       BACKGROUND: Patients with hepatitis B virus (HBV) infection require chronic and personalized care to improve outcomes. Large language models (LLMs) can potentially provide medical information for patients.
    AIM: To examine the performance of three LLMs, ChatGPT-3.5, ChatGPT-4.0, and Google Gemini, in answering HBV-related questions.
    METHODS: LLMs' responses to HBV-related questions were independently graded by two medical professionals using a four-point accuracy scale, and disagreements were resolved by a third reviewer. Each question was run three times using three LLMs. Readability was assessed via the Gunning Fog index and Flesch-Kincaid grade level.
    RESULTS: Overall, all three LLM chatbots achieved high average accuracy scores for subjective questions (ChatGPT-3.5: 3.50; ChatGPT-4.0: 3.69; Google Gemini: 3.53, out of a maximum score of 4). With respect to objective questions, ChatGPT-4.0 achieved an 80.8% accuracy rate, compared with 62.9% for ChatGPT-3.5 and 73.1% for Google Gemini. Across the six domains, ChatGPT-4.0 performed better on diagnosis, whereas Google Gemini excelled on clinical manifestations. Notably, in the readability analysis, the mean Gunning Fog index and Flesch-Kincaid grade level scores of the three LLM chatbots were significantly higher than the recommended eighth-grade level, far exceeding the reading level of the general population (the Gunning Fog formula is sketched after this entry).
    CONCLUSION: Our results highlight the potential of LLMs, especially ChatGPT-4.0, for delivering responses to HBV-related questions. LLMs may be an adjunctive informational tool for patients and physicians to improve outcomes. Nevertheless, current LLMs should not replace personalized treatment recommendations from physicians in the management of HBV infection.
    Keywords:  Accuracy; ChatGPT-3.5; ChatGPT-4.0; Google Gemini; Hepatitis B infection
    DOI:  https://doi.org/10.3748/wjg.v31.i3.101092
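    The Gunning Fog index mentioned here estimates the years of schooling needed to follow a text on first reading. A compact sketch using the standard formula, with the same rough syllable heuristic as in the sketch after entry 7:

        import re

        def gunning_fog(text):
            sentences = max(1, len(re.findall(r"[.!?]+", text)))
            words = re.findall(r"[A-Za-z']+", text)
            # "Complex" words: three or more syllables by a vowel-group count.
            complex_words = [w for w in words
                             if len(re.findall(r"[aeiouy]+", w.lower())) >= 3]
            return 0.4 * (len(words) / sentences
                          + 100 * len(complex_words) / len(words))

        print(round(gunning_fog("Antiviral therapy suppresses hepatitis B replication."), 1))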
  10. Oral Surg Oral Med Oral Pathol Oral Radiol. 2025 Jan 11. pii: S2212-4403(25)00002-1. [Epub ahead of print]
       OBJECTIVES: Artificial intelligence chatbots have demonstrated feasibility and efficacy in improving health outcomes. In this study, responses from 5 different publicly available AI chatbots (Bing, GPT-3.5, GPT-4, Google Bard, and Claude) to frequently asked questions related to oral cancer were evaluated.
    STUDY DESIGN: Relevant patient-related frequently asked questions about oral cancer were obtained from two main sources: public health websites and social media platforms. From these sources, 20 oral cancer-related questions were selected. Four board-certified specialists in oral medicine/oral and maxillofacial pathology assessed the answers using a modified version of the Global Quality Score on a 5-point Likert scale. Additionally, readability was measured using the Flesch-Kincaid Grade Level and Flesch Reading Ease scores. Responses were also assessed for empathy using a validated 5-point scale.
    RESULTS: Specialists ranked GPT-4 highest, with a total score of 17.3 ± 1.5, while Bing received the lowest at 14.9 ± 2.2. Bard had the highest Flesch Reading Ease score at 62 ± 7, and ChatGPT-3.5 and Claude received the lowest scores (more challenging readability). GPT-4 and Bard emerged as the superior chatbots in terms of empathy and accurate citations on patient-related frequently asked questions pertaining to oral cancer. GPT-4 had the highest overall quality, whereas Bing showed the lowest levels of quality, empathy, and citation accuracy.
    CONCLUSION: GPT-4 demonstrated the highest quality responses to frequently asked questions pertaining to oral cancer. Although impressive in their ability to guide patients on common oral cancer topics, most chatbots did not perform well when assessed for empathy or citation accuracy.
    DOI:  https://doi.org/10.1016/j.oooo.2024.12.028
  11. Clin Diabetes. 2025; 43(1): 53-58
      This study aimed to assess the diabetes health information found on TikTok and to quantify misinformation on the platform. The authors assessed 171 videos using two health literacy tools, DISCERN and the Patient Education Materials Assessment Tool for Audiovisual Materials, to rate the understandability and actionability of online medical content. The findings encourage health care professionals to use social media platforms to provide factual information about diabetes, and advise online health care consumers to rely on reputable sources such as trusted diabetes organizations' social media accounts, which tend to validate content with clinicians.
    DOI:  https://doi.org/10.2337/cd24-0042
  12. Health Informatics J. 2025 Jan-Mar; 31(1): 14604582251315587
      Objective: This study aimed to evaluate the presentation suitability and readability of ChatGPT's responses to common patient questions, as well as its potential to enhance readability. Methods: We initially analyzed 30 ChatGPT responses related to knee osteoarthritis (OA) on March 20, 2023, using readability and presentation suitability metrics. Subsequently, we assessed the impact of detailed and simplified instructions provided to ChatGPT for the same responses, focusing on readability improvement. Results: The readability scores for responses related to knee OA significantly exceeded the recommended sixth-grade reading level (p < .001). While the presentation of information was rated as "adequate," the content lacked high-quality, reliable details. After the intervention, readability improved slightly for responses related to knee OA; however, there was no significant difference in readability between the groups receiving detailed versus simplified instructions. Conclusions: Although ChatGPT provides informative responses, they are often difficult to read and of insufficient quality. Its current capabilities do not effectively simplify medical information for the general public. Technological advancements are needed to improve user-friendliness and practical utility.
    Keywords:  ChatGPT; artificial intelligence; conversational agent; online medical information; readability
    DOI:  https://doi.org/10.1177/14604582251315587
  13. Reg Anesth Pain Med. 2025 Jan 19. pii: rapm-2024-106231. [Epub ahead of print]
       BACKGROUND: This study evaluated the effectiveness of large language models (LLMs), specifically ChatGPT 4o and a custom-designed model, Meta-Analysis Librarian, in generating accurate search strings for systematic reviews (SRs) in the field of anesthesiology.
    METHODS: We selected 85 SRs from the top 10 anesthesiology journals according to Web of Science rankings and extracted their reference lists as benchmarks. Using study titles as input, we generated four search strings per SR: three with ChatGPT 4o using general prompts and one with the Meta-Analysis Librarian model, which follows a structured Population-Intervention-Comparator-Outcome (PICO)-based approach aligned with Cochrane Handbook standards. Each search string was used to query PubMed, and the retrieved results were compared with the PubMed results retrieved by the original search string of each SR to assess retrieval accuracy (a minimal sketch of this computation follows this entry). Statistical analysis compared the performance of each model.
    RESULTS: Original search strings demonstrated superior performance, with a 65% (IQR: 43%-81%) retrieval rate that was statistically different from both LLM groups (p=0.001). The Meta-Analysis Librarian achieved a higher median retrieval rate than ChatGPT 4o (median (IQR): 24% (13%-38%) vs 6% (0%-14%), respectively).
    CONCLUSION: The findings of this study highlight the significant advantage of using original search strings over LLM-generated search strings in PubMed retrieval studies. The Meta-Analysis Librarian demonstrated notable superiority in retrieval performance compared with ChatGPT 4o. Further research is needed to assess the broader applicability of LLM-generated search strings, especially across multiple databases.
    Keywords:  Methods; Nerve Block; TECHNOLOGY
    DOI:  https://doi.org/10.1136/rapm-2024-106231
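    A minimal sketch of the retrieval-rate comparison, treating each search strategy's PubMed results and a review's included studies as sets of PMIDs (the PMIDs below are illustrative):

        def retrieval_rate(retrieved_pmids, included_pmids):
            """Share of a review's included studies found by a search string."""
            included = set(included_pmids)
            return len(included & set(retrieved_pmids)) / len(included)

        # Toy example: an SR included 4 studies, of which a search string found 2.
        included  = {"111", "222", "333", "444"}
        retrieved = {"222", "444", "555", "666"}
        print(retrieval_rate(retrieved, included))   # 0.5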
  14. Dent Traumatol. 2025 Jan 23.
       BACKGROUND/AIM: The use of AI-driven chatbots for accessing medical information is increasingly popular among educators and students. This study aims to assess two ChatGPT models, ChatGPT 3.5 and ChatGPT 4.0, regarding their responses to queries about traumatic dental injuries, specifically for dental students and professionals.
    MATERIAL AND METHODS: A total of 40 questions were prepared, divided equally between those concerning definitions and diagnosis and those on treatment and follow-up. The responses from both ChatGPT versions were evaluated on several criteria: quality, reliability, similarity, and readability. These evaluations were conducted using the Global Quality Scale (GQS), the Reliability Scoring System (adapted DISCERN), the Flesch Reading Ease Score (FRES), the Flesch-Kincaid Reading Grade Level (FKRGL), and the Similarity Index. Normality was checked with the Shapiro-Wilk test, and variance homogeneity was assessed using the Levene test.
    RESULTS: The analysis revealed that ChatGPT 3.5 provided more original responses compared to ChatGPT 4.0. According to FRES scores, both versions were challenging to read, with ChatGPT 3.5 having a higher FRES score (39.732 ± 9.713) than ChatGPT 4.0 (34.813 ± 9.356), indicating relatively better readability. There were no significant differences between the ChatGPT versions regarding GQS, DISCERN, and FKRGL scores. However, in the definition and diagnosis section, ChatGPT 4.0 had a statistically higher quality score than ChatGPT 3.5. In contrast, ChatGPT 3.5 provided more original answers in the treatment and follow-up section. For ChatGPT 4.0, the readability and similarity rates for the definition and diagnosis section were higher than those for the treatment and follow-up section. No significant differences were observed between ChatGPT 3.5's DISCERN, FRES, FKRGL, and similarity index measurements by topic.
    CONCLUSIONS: Both ChatGPT versions offer high-quality and original information, though they present challenges in readability and reliability. They are valuable resources for dental students and professionals but should be used in conjunction with additional sources of information for a comprehensive understanding.
    Keywords:  ChatGPT; quality; traumatic dental injury
    DOI:  https://doi.org/10.1111/edt.13042
  15. Front Digit Health. 2024; 6: 1480381
       Introduction: Knee osteoarthritis (OA) significantly impacts the quality of life of those afflicted, with many patients eventually requiring surgical intervention. While Total Knee Arthroplasty (TKA) is common, it may not be suitable for younger patients with unicompartmental OA, who might benefit more from High Tibial Osteotomy (HTO). Effective patient education is crucial for informed decision-making, yet most online health information has been found to be too complex for the average patient to understand. AI tools like ChatGPT may offer a solution, but their outputs often exceed the public's literacy level. This study assessed whether a customised ChatGPT could be utilized to improve readability and source accuracy in patient education on Knee OA and tibial osteotomy.
    Methods: Commonly asked questions about HTO were gathered using Google's "People Also Asked" feature and formatted to an 8th-grade reading level. Two ChatGPT-4 models were compared: a native version and a fine-tuned model ("The Knee Guide") optimized for readability and source citation through Instruction-Based Fine-Tuning (IBFT) and Reinforcement Learning from Human Feedback (RLHF). The responses were evaluated for quality using the DISCERN criteria and readability using the Flesch Reading Ease Score (FRES) and Flesch-Kincaid Grade Level (FKGL).
    Results: The native ChatGPT-4 model achieved a mean DISCERN score of 38.41 (range 25-46), indicating poor quality, while "The Knee Guide" scored 45.9 (range 33-66), indicating moderate quality. Cronbach's alpha was 0.86, indicating good interrater reliability. "The Knee Guide" achieved better readability with a mean FKGL of 8.2 (range 5-10.7, ±1.42) and a mean FRES of 60 (range 47-76, ±7.83), compared to the native model's FKGL of 13.9 (range 11-16, ±1.39) and FRES of 32 (range 14-47, ±8.3). These differences were statistically significant (p < 0.001).
    Conclusions: Fine-tuning ChatGPT significantly improved the readability and quality of HTO-related information. "The Knee Guide" demonstrated the potential of customized AI tools in enhancing patient education by making complex medical information more accessible and understandable.
    Keywords:  ChatGPT; DISCERN criteria; artificial intelligence; high tibial osteotomy; knee osteoarthritis; patient education; readability
    DOI:  https://doi.org/10.3389/fdgth.2024.1480381
  16. PLoS One. 2025; 20(1): e0312832
        BACKGROUND: This study aimed to investigate the quality and readability of online English health information about dental sensitivity and how patients evaluate and utilize this web-based information.
    METHODS: Health information was obtained from three search engines and assessed for credibility and readability. We conducted searches in "incognito" mode to reduce the possibility of bias. Quality assessment utilized JAMA benchmarks, the DISCERN tool, and HONcode. Readability was analyzed using the SMOG, FRE, and FKGL indices.
    RESULTS: Out of 600 websites, 90 were included, 62.2% of them affiliated with dental or medical centers; among these websites, 80% related exclusively to dental implant treatments. Regarding JAMA benchmarks, currency was the most commonly achieved criterion, and 87.8% of websites fell into the "moderate quality" category. Word and sentence counts ranged widely, with means of 815.7 (±435.4) and 60.2 (±33.3), respectively. FKGL averaged 8.6 (±1.6), SMOG scores averaged 7.6 (±1.1), and the FRE scale showed a mean of 58.28 (±9.1), with "fairly difficult" being the most common category.
    CONCLUSION: The overall evaluation using DISCERN indicated a moderate quality level, with a notable absence of referencing. JAMA benchmarks revealed general non-adherence among websites, as none met all four criteria. Only one website was HONcode certified, suggesting a lack of reliable sources of accurate web-based health information. Readability assessments showed varying results, with the majority being "fairly difficult". Although readability did not differ significantly across affiliations, a wide range in word and sentence counts was observed across them.
    DOI:  https://doi.org/10.1371/journal.pone.0312832
  17. Trop Med Infect Dis. 2025 Jan 07. pii: 16. [Epub ahead of print] 10(1)
      Human rabies is preventable but almost always fatal once symptoms appear, causing an estimated 59,000 deaths globally each year. Limited awareness and inconsistent access to post-exposure prophylaxis hinder prevention efforts. To identify gaps and opportunities for improvement in online rabies information, we assessed the readability, understandability, actionability, and completeness of online public rabies resources from government and health agencies in Australia and similar countries. We identified materials via Google and public health agency websites, assessing readability using the Simple Measure of Gobbledygook (SMOG) index (sketched after this entry) and understandability and actionability with the Patient Education Materials Assessment Tool for Print materials (PEMAT-P). Completeness was assessed using a framework focused on general and vaccine-specific rabies information. An analysis of 22 resources found a median readability of grade 13 (range: 10-15), with a mean understandability of 66% and a mean actionability of 60%, both below recommended thresholds. Mean completeness was 79% for general rabies information and 36% for vaccine-specific information. Visual aids were under-utilised, and critical vaccine-specific information was often lacking. These findings highlight significant barriers in rabies information for the public, with most resources requiring a high literacy level and lacking adequate understandability and actionability. Improving readability, adding visual aids, and enhancing vaccine-related content could improve accessibility and support wider prevention efforts.
    Keywords:  health literacy; patient information; prevention; public health; rabies; readability; vaccination
    DOI:  https://doi.org/10.3390/tropicalmed10010016
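    The SMOG index used here, in its standard form (the full instrument samples 30 sentences; this sketch applies the formula directly, with the same rough syllable heuristic as earlier sketches):

        import math, re

        def smog(text):
            sentences = max(1, len(re.findall(r"[.!?]+", text)))
            words = re.findall(r"[A-Za-z']+", text)
            polysyllables = sum(1 for w in words
                                if len(re.findall(r"[aeiouy]+", w.lower())) >= 3)
            return 1.043 * math.sqrt(polysyllables * 30 / sentences) + 3.1291

        print(round(smog("Rabies vaccination prevents infection. Seek treatment immediately."), 1))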
  18. PLoS One. 2025; 20(1): e0317032
        OBJECTIVE: To evaluate and compare the readability of information on different treatment options for breast cancer from WeChat public accounts (WPAs), to propose targeted improvement strategies based on the evaluation results for the various treatment options, and to provide a reference for producers of WPAs seeking to write highly readable information on breast cancer treatment options.
    METHODS: Using "breast cancer" as the keyword, searches were conducted in April 2021 on the Sogou WeChat website (https://weixin.sogou.com/) and the WeChat mobile app. The selected WPAs aimed to provide breast cancer health information, and the four latest articles of each WPA were included in the evaluation. Two independent observers assessed the readability of the articles with the Suitability Assessment of Materials (SAM) tool and compared the readability of information on different treatment options, i.e., surgical treatment, medical treatment, complementary and alternative medicine (CAM), and comprehensive treatment.
    RESULTS: A total of 136 articles on different types of breast cancer treatments from 37 WPAs were included in the present study. The median SAM score was 50 (IQR, 41-60). In terms of treatment options, articles in the CAM category scored higher on content (75; IQR, 63-81), learning stimulation and motivation (75; IQR, 50-83), and cultural appropriateness (75; IQR, 75-75) than those in the medical and surgical treatment categories (P < 0.05). Additionally, articles in the CAM category scored higher on cultural appropriateness (75; IQR, 75-75) than those on comprehensive and medical treatment (P < 0.05).
    CONCLUSIONS: The overall readability of information on breast cancer treatment options in WPAs was in the lower portion of the "adequate" level. The readability of articles on medical treatment options was poor, especially for clinical trial articles, which could be improved in terms of content, graphics, learning stimulation, and motivation to make them more suitable for public reading.
    DOI:  https://doi.org/10.1371/journal.pone.0317032
  19. Contemp Clin Dent. 2024 Oct-Dec; 15(4): 292-294
      Dental students often prefer social media for its accessibility and low cost but must critically evaluate the content before applying it in practice. This study analyzed YouTube content on socket shielding. A new Google account was created to search for "Socket Shield Technique" and "Partial Extraction Therapy." Eligible videos were assessed for quality using the Global Quality (GQ) tool and the DISCERN tool. Statistical analysis was performed using the Statistical Package for the Social Sciences software. Results showed an average of 28.35 likes per video, no dislikes, and substandard content indicated by a DISCERN score of 25.25 ± 2.4 and a GQ score of 1.82 ± 0.38. The study concludes that video content on socket shielding is not reliable, emphasizing the importance of clinical observation and hands-on practice.
    Keywords:  Partial extraction therapy; YouTube; social media; socket-shield technique
    DOI:  https://doi.org/10.4103/ccd.ccd_317_24
  20. Surg Pract Sci. 2023 Sep; 14: 100199
       Background: General surgery residents frequently access YouTube® for educational walkthroughs of surgical procedures. The aim of this study is to evaluate the educational quality of YouTube® video walkthroughs on Laparoscopic Roux-en-Y gastric bypass (LRYGB) using a validated video assessment tool.
    Methods: A retrospective review of YouTube® videos was conducted for "laparoscopic Roux-en-Y gastric bypass", "laparoscopic RYGB", and "laparoscopic gastric bypass." The top 100 videos from three YouTube® searches were gathered and duplicates were removed. Included videos were categorized as Physician (produced by individual physician), Academic (university/medical school), or Society (professional surgical society) and rated by three independent investigators using the LAParoscopic surgery Video Educational GuidelineS (LAP-VEGaS) video assessment tool (0-18). The data were analyzed using one-way ANOVA with Bonferroni correction and Spearman's correlation test.
    Results: Of 300 videos gathered, 31 unique videos met selection criteria and were analyzed. The average LAP-VEGaS score was 8.67 (SD 3.51). Society videos demonstrated a significantly higher mean LAP-VEGaS score than Physician videos (p = 0.023). Most videos lacked formal case presentation (71%), intraoperative findings (81%), and operative time (76%). No correlation was demonstrated between LAP-VEGaS scores and number of likes or views, video length, or upload date.
    Conclusions: LRYGB training videos on YouTube® generally do not adhere to the LAP-VEGaS guidelines and are of poor educational quality, signaling areas of improvement for educators.
    Keywords:  Bariatric surgery; Educational quality; Gastric bypass; Surgical training; Video
    DOI:  https://doi.org/10.1016/j.sipas.2023.100199
  21. J Cardiovasc Electrophysiol. 2025 Jan 20.
       BACKGROUND: Permanent pacemaker (PPM) implantation is a commonly performed procedure. Patients increasingly use the Internet for information on medical interventions. We aimed to assess the quality of videos discussing PPM implantation on YouTube for patient consumption.
    METHODS: YouTube was searched on October 19, 2022, for "PPM implantation" and "Pacemaker." The first 100 results from each search were screened: all English language videos containing predominant discussion of transvenous PPMs were included; YouTube shorts and advertisements were excluded. Two authors independently assessed videos for information content based on criteria generated using established patient resources supplemented by expert consensus. Video reliability and quality were assessed using a novel scoring system.
    RESULTS: Thirty-three videos with a cumulative total of 5,864,488 views were included. No video contained all essential information criteria. The average number of essential criteria covered was 8/32 (standard deviation 4.8). Peri-operative management was particularly poorly covered: no item relating to preoperative management or postoperative care was covered by more than 40% of videos. None of the videos fulfilled all quality criteria, with a median score of 7.5/13 (interquartile range 6.5-8). Videos performed particularly poorly on providing balanced messages, creator disclosures, attribution of source content, and indicating when videos were made.
    CONCLUSION: YouTube videos of PPM implantation do not contain sufficient information to allow patients to gain an appropriate understanding of this procedure. Furthermore, the information presented is of insufficient quality to support decision-making. There is a need for a professionally regulated, comprehensive audiovisual patient resource on PPM implantation.
    Keywords:  YouTube; pacemaker; patient information; video
    DOI:  https://doi.org/10.1111/jce.16540
  22. Front Public Health. 2024; 12: 1472583
       Background: The prevalence of lymphedema is rising, necessitating accurate diagnostic and treatment information for affected patients. Short video-sharing platforms facilitate access to such information but require validation regarding the reliability and quality of the content presented. This study aimed to assess the reliability and quality of lymphedema-related information on Chinese short video-sharing platforms.
    Methods: We collected 111 video samples addressing the diagnosis and treatment of lymphedema from four platforms: TikTok, Bilibili, WeChat, and Microblog. Two independent surgeons evaluated each video for content comprehensiveness, quality (using the Global Quality Score), and reliability (using the modified DISCERN tool). The videos from different sources were subsequently compared and analyzed.
    Results: Out of 111 videos analyzed, 66 (59.46%) were uploaded by medical professionals, including breast surgeons, vascular surgeons, plastic surgeons, physical therapists, and gynecologists, while 45 (40.54%) were shared by non-medical professionals such as science bloggers, medical institutions, and lymphedema patients. Patient-uploaded videos received the highest engagement, with median likes of 2,257 (IQR: 246.25-10,998.25) and favorites of 399 (IQR: 94.5-1,794.75). Thirteen videos (11.71%) contained inaccuracies. Medical professionals' videos generally showed higher content comprehensiveness, particularly those by plastic surgeons, compared to non-medical professionals' videos. The GQS and the modified DISCERN tool were used to assess video quality and reliability, respectively, with medical professionals scoring higher on both metrics (z = 3.127, p = 0.002; z = 2.010, p = 0.044). The quality and reliability of recommendations provided by plastic surgeons surpassed those of other medical professionals (χ² = 16.196, p = 0.003; χ² = 9.700, p = 0.046). No significant differences in video quality and reliability were found among the three categories of non-medical professionals (χ² = 3.491, p = 0.175; χ² = 2.098, p = 0.350).
    Conclusion: Our study shows that lymphedema-related videos on short video platforms vary widely in quality. Videos by medical professionals are generally more accurate and of higher quality than those by non-professionals. However, patient-uploaded videos often get more engagement due to their relatability. To ensure public access to reliable information, establishing basic standards for this content is essential.
    Keywords:  TikTok; information quality; lymphedema; short videos; social media
    DOI:  https://doi.org/10.3389/fpubh.2024.1472583
  23. Breast J. 2025; 2025: 9487931
      Aim: The purpose of this study was to investigate the quality and reliability of YouTube video content on prophylactic mastectomy. Material and Methods: The search terms "prophylactic mastectomy," "prophylactic mastectomy surgery," "preventive surgery for breast cancer," "risk-reducing mastectomy," and "prophylactic mastectomy and breast reconstruction" were searched on YouTube. The uploader, video content, length (seconds), upload date, number of days since upload, number of views, number of comments, and likes were recorded and evaluated. Finally, the videos included in the study were evaluated using the modified Quality Criteria for Consumer Health Information (DISCERN) and the Global Quality Scale (GQS). Results: The total number of views of the 50 videos reviewed in the study was 3,674,469. The mean DISCERN score of the two observers was 3.35 ± 1, and the videos were found to be of medium reliability. The mean GQS score of all videos was 3.39 ± 0.9, indicating medium quality. The researchers gave 1-2 points (misleading) to 7 (14%) videos, 3 points (somewhat helpful) to 20 (40%) videos, 4 points (beneficial) to 16 (32%) videos, and 5 points (excellent) to 7 (14%) videos. Conclusion: We found that videos uploaded by doctors were of good quality, videos uploaded by health channels were of medium quality, and videos uploaded by patients were of poor quality and misleading. Videos with health content should be evaluated by the relevant specialists, and only useful videos should be published.
    Keywords:  DISCERN; GQS; YouTube; prophylactic mastectomy; quality
    DOI:  https://doi.org/10.1155/tbj/9487931
  24. Z Evid Fortbild Qual Gesundhwes. 2025 Jan 20. pii: S1865-9217(24)00265-4. [Epub ahead of print]
        INTRODUCTION: Web-based health information can support health-related decisions if it is of high quality, i.e., accurate, understandable, and barrier-free. Our study systematically searched for German-language, web-based health information on the prevention and prediction of food allergies in children and assessed its content and quality.
    METHODS: In July 2022, four researchers conducted a systematic Google search for German-language web-based health information (HI) on the prediction and prevention of food allergies in children. They searched independently of each other with a predefined search algorithm. Two independent reviewers analyzed the data using qualitative and quantitative content analysis (step/analysis 1) and assessed the quality of HI (step/analysis 2) using a comprehensive criteria catalog (transparency, text design, content, language, presentation of frequencies and statistical information, visualization, and accessibility).
    RESULTS: The systematic search yielded 59 websites, provided by nine sectors. The most frequent sectors were "Health portals and expert opinions" and "Guidelines/scientific and medical specialized information" (22% each). The content analysis (step 1) showed, among other things, that the topic of prediction was only implicitly addressed. Forty-nine materials (83%) contained guideline-compliant information. However, there were also 26 materials (44%) whose content was not in line with the current S3 guideline on allergy prevention. Quality assessment (step 2) revealed that only a small number of the 43 HI received good or very good ratings on the transparency (n = 3, 7%) and content (n = 9, 21%) criteria. The criterion concerning frequencies and statistical information was rated good or very good in only 11 HI (26%). Almost all HI met the quality criteria for language (n = 38, 88%), text design (n = 43, 100%), and visualization (n = 43, 100%). None of the evaluated HI received a good or very good rating on the accessibility criteria. The analysis by sector revealed only minor differences (mean of the seven criteria: 56-69%).
    CONCLUSION: The quality of the available web-based health information on the prevention and prediction of food allergies in children is highly heterogeneous. There is a need for improvement in terms of accessibility, content (e.g., selective presentation of prevention measures), and transparency (e.g., missing contact details). Further research is needed to expand the user perspective and to analyze social media in the context of the prediction and prevention of food allergies in children.
    Keywords:  Allergy prevention; Children; Digital information; Food; Parents
    DOI:  https://doi.org/10.1016/j.zefq.2024.11.010
  25. Eur J Midwifery. 2025; 9
        INTRODUCTION: During pregnancy, women rely on a variety of sources to obtain information. However, not all of these sources are equally reliable, and there is concern that online information-seeking in particular may increase pregnancy-related anxiety. This study examines to what extent different sources of pregnancy information are associated with concurrent pregnancy-related anxiety (RQ1) and with changes in pregnancy-related anxiety throughout the pregnancy (RQ2).
    METHODS: This study was integrated into the ongoing Swedish Mom2B study (sub-study data collection: December 2022-April 2024), where women complete weekly questionnaires via a research app. Each trimester, they received questions about their use of information sources and pregnancy-related anxiety.
    RESULTS: Our sample consisted of 751 pregnant women (273 with at least two waves of data). Using the midwife (β= -0.14, p<0.001; 95% CI: -3.32 to -1.13) or the social circle (β= -0.08, p<0.05; 95% CI: -2.83 to -0.07) as a source of pregnancy- and childbirth-related information was associated with lower levels of pregnancy-related anxiety. In contrast, reliance on online sources for information was associated with higher levels of anxiety (β=0.14, p<0.001; 95% CI: 1.52 to 5.03). Except for (e-)books, which lowered the odds of improving anxiety (OR=0.62, p<0.01; 95% CI: 0.45-0.85), none of the information sources predicted changes in pregnancy-related anxiety over time.
    CONCLUSIONS: Not all information sources play an equal role in relation to pregnancy-related anxiety. Interpersonal sources in particular may help mitigate anxiety. However, future research with more nuanced methodologies and shorter measurement intervals could clarify possible causal relationships and refine our understanding of how various information sources affect pregnancy-related anxiety over time.
    Keywords:  information seeking behavior; maternal behavior; maternal mental health; online health information seeking; pregnancy-related anxiety
    DOI:  https://doi.org/10.18332/ejm/197169
  26. Cureus. 2025 Jan; 17(1): e77759
      Background Social media (SM) platforms are commonly used in Saudi Arabia, including for health information. SM platforms allow users to have conversations, share information, and create web content. Given the growing dependence on social media for health-related concerns, it is critical to understand how Saudis use these platforms to obtain health information. This study aimed to determine the Saudi population's attitude and awareness regarding health information sought on SM. Subject and methods This cross-sectional study was conducted among adults in Riyadh, Kingdom of Saudi Arabia, from September to October 2024. A self-administered questionnaire was distributed randomly in the King Saud University Medical City family medicine clinic. The questionnaire included socio-demographic data (e.g., age, gender, marital status), the most commonly used types of SM, and various questions to assess the knowledge and influence of SM on health information. Results Among the 330 participants, 117 (63%) were female respondents, and 126 (38.2%) were between 31 and 40 years old. WhatsApp was the most prominent type of SM used, at 192 (58.2%). Disease or medical problems were the most common health information sought online, at 172 (52.1%), and "to be informed" was the most common reason for seeking health information online, at 237 (72.4%). The perception of unemployed female respondents that health information obtained from SM is reliable was significantly higher than that of unemployed male respondents (p<0.05). Surprisingly, male participants were more likely to believe that SM can enhance awareness (p = 0.015). Conclusion The findings of this study suggest that SM influences the behavior of the adult population seeking health information in Saudi Arabia. Female participants tended to believe that the health information obtained from SM was credible. Being more informed was the primary reason for seeking health information online. There is a need to educate patients visiting family medicine clinics about the reliability of health information obtained online.
    Keywords:  awareness; health information; public health; saudi arabia; social media
    DOI:  https://doi.org/10.7759/cureus.77759