bims-librar Biomed News
on Biomedical librarianship
Issue of 2025-06-08
Thirteen papers selected by
Thomas Krichel, Open Library Society



  1. Adv Physiol Educ. 2025 Jun 06.
       INTRODUCTION: Physiology specialists at the Faculty of Medicine, University of Ruhuna, are concerned about the possible widespread use of easily accessible but potentially unreliable online materials for studying. The full picture is not clear, as this area has been underexplored.
    METHODS: An observational cross-sectional study was conducted using an online, self-administered questionnaire, designed specifically for this study and not previously validated, to evaluate the prevalence, types, and practices of online resource use for studying physiology. All second-year medical students who had recently completed the second MBBS examination in 2024 were recruited.
    RESULTS: Of the 185 students who responded, 77.8% used recommended textbooks as their preferred choice for studying physiology; at the same time, 100% reported using online resources. Most students (n=171) used smartphones for internet access. ChatGPT was the preferred online tool for 71.9% (n=133), while 76.2% (n=141) used YouTube videos to understand physiology concepts. Notably, 54.05% (n=100) chose YouTube videos at random, while 16.7% used Ninja Nerd and 16.2% used Khan Academy. A total of 52% (n=96) used video materials on most days, and 81% of students first searched online for answers before consulting their lecturers. Most students (76.6%) stated that videos are easy to understand. Over half (54.57%) strongly agreed or agreed that they 'fact-check' content against recommended textbooks, articles, or lecture materials provided by the Physiology Department.
    CONCLUSION: Most students prefer online resources like ChatGPT and YouTube for learning Physiology, showing a shift towards digital tools. Although many students fact-check content, clear guidance on selecting reliable online materials is necessary, given their widespread usage.
    Keywords:  Learning Physiology; Medical students; Online resources
    DOI:  https://doi.org/10.1152/advan.00061.2025
  2. PLoS One. 2025;20(6): e0325513
       BACKGROUND: The integration of the internet into daily life has profoundly influenced the public's health information-seeking behavior. Free access to cyberspace has created a fertile environment for the spread of oral health misinformation, which can have a detrimental impact on the public's oral health. The prevalence of oral health misinformation in Jordan has not been investigated; it is therefore crucial to understand how oral health misinformation originates and what contributes to its dissemination.
    OBJECTIVES: This study aims to examine the prevalence of oral health misinformation published on web pages in Jordan and to offer insight into the public's information-seeking behaviors regarding oral health.
    METHODS: This study is a mixed methods infodemiological analysis of oral health misinformation. A systematic content analysis was executed on web pages published in the Arabic language in Jordan from 2019 to 2023.
    RESULTS: A total of 704 web pages were retrieved, of which 320 relevant web pages were included in the content analysis. Among these, 193 web pages (60.3%) published oral health misinformation. Publishers without a professional background produced 185 (95.9%) of the misinformation-containing web pages. By dental field, misinformation occurred most frequently in oral medicine (101 web pages, 52.3%). The validity of published oral health information was significantly influenced by the publishers' interest (P = 0.006), the articles' main themes (P = 0.005), and the publishers' professional background (P < 0.001). Contextual analysis of oral health misinformation showed significant differences among dental fields (P = 0.019), with the most frequent occurrences related to causes (18.8%), home remedies (15.7%), and treatment (15.5%). Geographical variations in interest in oral health searches were observed across Jordanian governorates (P < 0.001), and temporal trends in interest varied significantly across the five-year period (P = 0.019).
    CONCLUSION: The findings of this study suggest a need for public health interventions to restrict the dissemination of oral health misinformation.
    DOI:  https://doi.org/10.1371/journal.pone.0325513
  3. Cochrane Evid Synth Methods. 2024 Jun;2(6): e12078
       Introduction: One of the main tasks in information retrieval is the development of Boolean search strategies for systematic searches in bibliographic databases. This includes the identification of free-text terms and controlled vocabulary. IQWiG has previously implemented its objective approach to search strategy development using fee-based text analysis software. However, this implementation is not fully automated due to a lack of technical options. The aim of our project was to develop a text analysis tool for the development of Boolean search strategies using R.
    Methods: We adopt an incremental approach to software development, with the first goal being to develop a minimum viable product for the previously defined use cases. To create an interactive user interface, we use the shiny framework.
    Results: Our newly developed shiny app searchbuildR is a text analysis tool with a point-and-click user interface that automatically extracts and ranks terms from the titles, abstracts, and MeSH terms of a given test set of PubMed records. It returns searchable, interactive tables of free-text and MeSH terms. Each free-text term can also be viewed within its original context in the full titles and abstracts or in a user-defined word window. In addition, two-word combinations are extracted and provided as an interactive table to help the user identify free-text term combinations that can be searched with proximity operators in Boolean searches (an illustrative term-ranking sketch follows this entry). The results can be exported to a CSV file. The new implementation with searchbuildR was evaluated by validating the text analysis results against the results of the previously used fee-based software.
    Conclusions: IQWiG has developed the shiny app searchbuildR to support the development of search strategies in systematic reviews. It is open source and can be used by researchers and other information specialists without extensive R or programming skills. The package code is openly available on GitHub at www.github.com/IQWiG/searchbuildR.
    Keywords:  data mining; evidence synthesis; information storage and retrieval; natural language processing; review literature as topic; systematic reviews as topic; user‐centered design
    DOI:  https://doi.org/10.1002/cesm.12078
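    To illustrate the kind of frequency-based term ranking that searchbuildR performs, here is a minimal Python sketch (not the authors' R implementation) that counts and ranks single terms and adjacent two-word combinations from the titles and abstracts of a test set of records. The record texts and the stop-word list are invented for the example.

        from collections import Counter
        import re

        # Hypothetical test set: title/abstract text of already-retrieved PubMed records.
        records = [
            "Remote ischemic preconditioning in cardiac surgery: a randomized trial.",
            "Effect of ischemic preconditioning on myocardial injury after cardiac surgery.",
        ]

        STOPWORDS = {"a", "of", "on", "in", "the", "after", "and"}  # minimal example list

        def tokenize(text):
            # Lowercase, keep alphabetic tokens, drop stop words.
            words = re.findall(r"[a-z][a-z-]*", text.lower())
            return [w for w in words if w not in STOPWORDS]

        term_counts = Counter()
        bigram_counts = Counter()
        for rec in records:
            tokens = tokenize(rec)
            term_counts.update(tokens)
            bigram_counts.update(zip(tokens, tokens[1:]))  # adjacent two-word combinations

        # Ranked candidates for free-text terms and for phrases searched with proximity operators.
        print(term_counts.most_common(5))
        print(bigram_counts.most_common(5))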
  4. BMC Oral Health. 2025 May 31. 25(1): 871
       BACKGROUND: Artificial intelligence chatbots have the potential to inform and guide patients by providing human-like responses to questions about dental and maxillofacial prostheses. Information regarding the accuracy and qualifications of these responses is limited. This in-silico study aimed to evaluate the accuracy, quality, readability, understandability, and actionability of the responses from DeepSeek-R1, ChatGPT-o1, ChatGPT-4, and Dental GPT chatbots.
    METHODS: The four chatbots were queried with 35 of the most frequently asked patient questions about dental and maxillofacial prostheses. The accuracy, quality, and understandability/actionability of the responses were assessed by two prosthodontists using a five-point Likert scale, the Global Quality Score, and the Patient Education Materials Assessment Tool for Printed Materials, respectively. Readability was scored using the Flesch-Kincaid Grade Level and Flesch Reading Ease (a short sketch of both formulas follows this entry). Agreement was assessed using Cohen's kappa. Differences between chatbots were analyzed using the Kruskal-Wallis test, one-way ANOVA, and post-hoc tests.
    RESULTS: The chatbots showed significant differences in accuracy and readability (p < .05). Dental GPT recorded the highest accuracy score, whereas ChatGPT-4 had the lowest. In readability, DeepSeek-R1 performed best, while Dental GPT performed worst. Quality, understandability, actionability, and reader education scores did not differ significantly.
    CONCLUSIONS: While accuracy varied among chatbots, the domain-specifically trained AI tool and ChatGPT-o1 demonstrated superior accuracy. Even when accuracy is high, misinformation in health care can have significant consequences. Enhancing the readability of responses is essential, and chatbots should be chosen accordingly. The accuracy and readability of information from chatbots should be monitored in the interest of public health.
    Keywords:  Dental prosthesis; Digital health; Generative artificial intelligence; Maxillofacial prosthesis; Patient education; Public health
    DOI:  https://doi.org/10.1186/s12903-025-06267-w
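    Both readability metrics cited above are simple functions of average sentence length and syllables per word: Flesch Reading Ease = 206.835 - 1.015 * (words/sentence) - 84.6 * (syllables/word), and Flesch-Kincaid Grade Level = 0.39 * (words/sentence) + 11.8 * (syllables/word) - 15.59. The Python sketch below implements both; its syllable counter is a rough vowel-group heuristic for illustration, whereas studies typically rely on validated calculators.

        import re

        def count_syllables(word):
            # Rough heuristic: count groups of consecutive vowels.
            return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

        def readability(text):
            sentences = max(1, len(re.findall(r"[.!?]+", text)))
            words = re.findall(r"[A-Za-z]+", text)
            syllables = sum(count_syllables(w) for w in words)
            wps = len(words) / sentences      # words per sentence
            spw = syllables / len(words)      # syllables per word
            fres = 206.835 - 1.015 * wps - 84.6 * spw   # Flesch Reading Ease
            fkgl = 0.39 * wps + 11.8 * spw - 15.59      # Flesch-Kincaid Grade Level
            return fres, fkgl

        print(readability("Dentures should be cleaned daily. Remove them at night to rest the gums."))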
  5. Skeletal Radiol. 2025 Jun 05.
       OBJECTIVE: To evaluate the quality, reliability, and educational value of social media videos available on TikTok, YouTube, and Facebook related to vertebroplasty and kyphoplasty using validated assessment tools.
    MATERIALS AND METHODS: A systematic search was conducted in two rounds: October 5-6, 2024, and again on October 19-20, 2024, on YouTube, TikTok, and Facebook using the keywords "vertebroplasty," "kyphoplasty," and "vertebral augmentation." Only publicly accessible English-language videos specifically addressing these procedures were included. Exclusion criteria were non-relevant, promotional, duplicate, or technically inadequate content. Two interventional radiologists independently assessed the videos using the DISCERN instrument (range, 15-75), the Journal of the American Medical Association (JAMA) benchmark criteria (0-4), and the Global Quality Score (GQS, 1-5). Statistical analyses were performed to identify correlations between video characteristics and quality scores.
    RESULTS: A total of 101 videos met the inclusion criteria (YouTube 85%, TikTok 8%, Facebook 7%). The mean DISCERN, JAMA, and GQS scores were 34 ± 1.45, 1.87 ± 0.07, and 2.18 ± 0.13, respectively, indicating poor overall quality. Videos with structured presentations and physician narration scored significantly higher, whereas video length correlated positively with quality up to a threshold of 460 s for DISCERN and 501 s for GQS. Video popularity (likes) showed no significant correlation with quality scores.
    CONCLUSION: Social media videos on vertebroplasty and kyphoplasty are generally of low educational quality and reliability. Videos presented by healthcare professionals and those with structured formats tend to score higher. These findings underscore the need for expert-driven, high-quality medical content to improve patient education and reduce misinformation on social media platforms.
    Keywords:  Education; Kyphoplasty; Medical communication; Social media; Vertebroplasty
    DOI:  https://doi.org/10.1007/s00256-025-04962-x
  6. JMIR Form Res. 2025 Jun 03. 9: e64630
       BACKGROUND: The widespread availability of health information online, coupled with easy internet access, has led pregnant women to rely heavily on online sources for pregnancy-related guidance. Internet-based nutrition information can enable positive dietary changes during pregnancy. However, although some sources are valuable, some information increases maternal anxiety, and women can become confused when different websites conflict on the same topics. Concerns about the reliability and impact of this information have surfaced, contributing to heightened anxiety among expectant mothers. The importance of the quality of web-based information is increasingly recognized; however, no studies have evaluated the quality of nutrition-related information for pregnant women.
    OBJECTIVE: This study aims to bridge this research gap by assessing the quality of online health information concerning prenatal nutrition tailored to pregnant women.
    METHODS: This cross-sectional descriptive study was conducted through a Google keyword search on February 14, 2023. We used search terms such as "pregnancy," "pregnant women," "diet," and "nutrition" and conducted an exhaustive Google search. The quality of the retrieved information was evaluated using the Quality Evaluation Scoring Tool (QUEST).
    RESULTS: The top 20 Google-searched sites were evaluated using the QUEST tool. The average score was 11.7 points, ranging from 6 to 15, with most sites scoring between 11 and 15. Half of the websites lacked clear authorship, and most gave weak or no attribution to specific scientific sources. While conflict of interest scored highest overall, with 60% showing no bias, some sites promoted products or specific interventions. Currency was inconsistent: only half had been updated within 5 years. Complementarity received the lowest scores, with 70% lacking support for patient-physician relationships. The tone was generally positive, with 95% supporting their claims, though only one site used a balanced, well-reasoned tone. Discrepancies in cited guidelines on nutritional intake and inappropriate statements about alcohol, weight management, and miscarriage raised concerns about the information's accuracy and appropriateness.
    CONCLUSIONS: Although many websites use cautious language to mitigate commercial influence, deficiencies persist in areas crucial for empowering informed decision-making among pregnant women. Our assessment found that incorrect or poorly evidenced information appears at the top of search results, where it is easily accessible to users. Inadequate attribution of authorship, unclear conflicts of interest, and out-of-date information pose substantial challenges to the reliability and usefulness of online health resources on prenatal nutrition. Since internet-based information is the most accessible, reliable evidence should be provided to protect everyone, including those with low health literacy, from misinformation and from potential physical and psychological harm.
    Keywords:  QUEST; assessment; availability; decision-making; diet; dietary information; internet; internet-based; misinformation; nutrition; nutrition-related; online health information; physical harm; pregnancy; pregnancy-related guidance; pregnant women; prenatal nutrition; psychological harm; quality assessment; tools; web-based; web-based information; website assessment; women’s health
    DOI:  https://doi.org/10.2196/64630
  7. Knee Surg Sports Traumatol Arthrosc. 2025 Jun 01.
       PURPOSE: This study compares ChatGPT-4o, equipped with its deep research feature, and DeepSeek R1, equipped with its deepthink feature (both enabling real-time online data access), in generating responses to frequently asked questions (FAQs) about anterior cruciate ligament (ACL) surgery. The aim is to evaluate and compare their performance in terms of accuracy, clarity, completeness, consistency, and readability for evidence-based patient education.
    METHODS: A list of ten FAQs about ACL surgery was compiled after reviewing the Sports Medicine Fellowship Institution's webpages. These questions were posed to ChatGPT and DeepSeek in their research-enabled modes. Orthopaedic sports surgeons evaluated the responses for accuracy, clarity, completeness, and consistency using a 4-point Likert scale. Inter-rater reliability of the evaluations was assessed using intraclass correlation coefficients (ICCs). In addition, a readability analysis was conducted using the Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease Score (FRES) metrics via an established online calculator to objectively measure textual complexity. Paired t-tests were used to compare the mean scores of the two models for each criterion (see the short sketch after this entry), with significance set at p < 0.05.
    RESULTS: Both models demonstrated high accuracy (mean scores of 3.9/4) and consistency (4/4). Significant differences were observed in clarity and completeness: ChatGPT provided more comprehensive responses (mean completeness 4.0 vs. 3.2, p < 0.001), while DeepSeek's answers were clearer and more accessible to laypersons (mean clarity 3.9 vs. 3.0, p < 0.001). DeepSeek had lower FKGL (8.9 vs. 14.2, p < 0.001) and higher FRES (61.3 vs. 32.7, p < 0.001), indicating greater ease of reading for a general audience. ICC analysis indicated substantial inter-rater agreement (composite ICC = 0.80).
    CONCLUSION: ChatGPT-4o, leveraging its deep research feature, and DeepSeek R1, utilizing its deepthink feature, both deliver high-quality, accurate information for ACL surgery patient education. While ChatGPT excels in comprehensiveness, DeepSeek outperforms in clarity and readability, suggesting that integrating the strengths of both models could optimize patient education outcomes.
    LEVEL OF EVIDENCE: Level V.
    Keywords:  ChatGPT; anterior cruciate ligament (ACL) surgery; artificial intelligence (AI); deep research; patient education
    DOI:  https://doi.org/10.1002/ksa.12711
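    As a small illustration of the comparison step described in the methods, the sketch below runs a paired t-test on per-question scores from two models with SciPy; the score arrays are invented and do not reproduce the study's data.

        from scipy.stats import ttest_rel

        # Hypothetical completeness scores (4-point Likert) for the same ten FAQs under two models.
        model_a = [4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
        model_b = [3, 3, 4, 3, 3, 4, 3, 3, 3, 3]

        # Each question is rated under both models, so the observations are paired.
        t_stat, p_value = ttest_rel(model_a, model_b)
        print(f"t = {t_stat:.2f}, p = {p_value:.4f}")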
  8. Clin Neurol Neurosurg. 2025 May 28. pii: S0303-8467(25)00269-0. [Epub ahead of print] 255: 108986
       BACKGROUND: Social media platforms are used by patients before scheduling formal consultations and also serve as a means of pursuing second opinions. Cerebrovascular pathologies require regular surveillance and specialized care. In recent years, chatbots have been trained to provide information on neurosurgical conditions, but their ability to answer questions in vascular neurosurgery has not been evaluated against physician responses. This pilot study evaluates the accuracy, completeness, empathy, and readability of responses provided by ChatGPT-3.5 (OpenAI, San Francisco) compared with standard specialist physician responses on social media.
    METHODS: We identified the top 50 cerebrovascular questions and their verified physician responses from Reddit. These questions were entered into ChatGPT. Responses were anonymized and rated on a Likert scale for accuracy with respect to neurosurgical guidelines, completeness, and empathy by four independent reviewers. Readability was assessed using standardized indexes (Flesch Reading Ease, Flesch-Kincaid Grade, Gunning Fog Index, Simple Measure of Gobbledygook (SMOG) Index, Automated Readability Index, and Coleman-Liau Index); a short sketch computing these indexes follows this entry.
    RESULTS: Responses provided by ChatGPT had significantly higher ratings of completeness (median (IQR) 3 (2-3) vs. 2 (1-3)) and empathy (4 (3-5) vs. 2 (1-3)) compared to physician responses (p < 0.001). Accuracy of healthcare information did not differ significantly (4 (3-4) vs. 4 (3-4), p = 0.752). Physician responses had significantly higher ease of readability and lower grade-level readability compared to ChatGPT (p < 0.001).
    CONCLUSION: Our results suggest higher empathy and completeness of information provided by ChatGPT compared to physicians. However, these responses are written at readability levels above the literacy of the average American population. Future research could explore incorporating chatbot output when drafting physician responses, to provide more balanced answers to healthcare questions.
    Keywords:  Artificial intelligence; Chatbot; Large language models; Social media; Vascular neurosurgery
    DOI:  https://doi.org/10.1016/j.clineuro.2025.108986
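    All six readability indexes named in the methods can be computed with the open-source Python package textstat, as in the short sketch below; the sample response text is invented for illustration.

        import textstat  # pip install textstat

        response = (
            "A brain aneurysm is a weak spot in a blood vessel wall. "
            "Small, unruptured aneurysms are often monitored with periodic imaging."
        )

        # The six indexes used in the study, via textstat's standard functions.
        scores = {
            "Flesch Reading Ease": textstat.flesch_reading_ease(response),
            "Flesch-Kincaid Grade": textstat.flesch_kincaid_grade(response),
            "Gunning Fog Index": textstat.gunning_fog(response),
            "SMOG Index": textstat.smog_index(response),
            "Automated Readability Index": textstat.automated_readability_index(response),
            "Coleman-Liau Index": textstat.coleman_liau_index(response),
        }
        for name, value in scores.items():
            print(f"{name}: {value:.1f}")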
  9. Int J Impot Res. 2025 Jun 03.
      This study aims to evaluate and compare the performance of artificial intelligence chatbots by assessing the reliability and quality of the information they provide regarding penis enhancement (PE). Search trends for keywords related to PE were determined using Google Trends (https://trends.google.com) and Semrush (https://www.semrush.com). Data covering a ten-year period were analyzed, taking into account regional trends and changes in search volume. Based on these trends, 25 questions were selected and categorized into three groups: general information (GI), surgical treatment (ST), and myths/misconceptions (MM). These questions were posed to three advanced chatbots: ChatGPT-4, Gemini Pro, and Llama 3.1. Responses from each model were analyzed for readability using the Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease Score (FRES), while the quality of the responses was evaluated using the Ensuring Quality Information for Patients (EQIP) tool and the Modified DISCERN Score. All chatbot responses were difficult to read and understand according to the FKGL and FRES, with no statistically significant differences among them (FKGL: p = 0.167; FRES: p = 0.366). Llama achieved the highest median Modified DISCERN score (4 [IQR: 1]), significantly outperforming ChatGPT (3 [IQR: 0]) and Gemini (3 [IQR: 2]) (p < 0.001). Pairwise comparisons showed no significant difference between ChatGPT and Gemini (p = 0.070), but Llama was superior to both (p < 0.001). In EQIP scores, Llama also scored highest (73.8 ± 2.2), significantly surpassing ChatGPT (68.7 ± 2.1) and Gemini (54.2 ± 1.3) (p < 0.001). Across categories, Llama consistently achieved higher EQIP scores (GI: 71.1 ± 1.6; ST: 73.6 ± 4.1; MM: 76.3 ± 2.1) and Modified DISCERN scores (GI: 4 [IQR: 0]; ST: 4 [IQR: 1]; MM: 3 [IQR: 1]) compared to ChatGPT (EQIP: GI: 68.4 ± 1.1; ST: 65.7 ± 2.2; MM: 71.1 ± 1.7; Modified DISCERN: GI: 3 [IQR: 1]; ST: 3 [IQR: 1]; MM: 3 [IQR: 0]) and Gemini (EQIP: GI: 55.2 ± 1.4; ST: 55.2 ± 1.6; MM: 2.6 ± 2.5; Modified DISCERN: GI: 1 [IQR: 2]; ST: 1 [IQR: 2]; MM: 3 [IQR: 0]) (p < 0.001). This study highlights Llama's superior reliability in providing PE-related health information, though all chatbots struggled with readability.
    DOI:  https://doi.org/10.1038/s41443-025-01098-3
  10. J Surg Educ. 2025 Jun 02. pii: S1931-7204(25)00128-X. [Epub ahead of print] 82(8): 103547
       INTRODUCTION: YouTube has become a widely used resource for medical students and trainees seeking instruction in surgical skills, including suturing techniques. While accessible, YouTube lacks formal peer review, raising concerns about the quality and reliability of its educational content. This study analyzes the landscape of YouTube suturing videos, with a focus on content quality, instructor background, and engagement patterns.
    OBJECTIVE: To characterize YouTube's suturing instruction videos by evaluating the background of instructors, suture techniques demonstrated, and viewer engagement, in order to identify trends in educational content and areas for improvement.
    DESIGN: This cross-sectional study included 303 YouTube videos selected for English-language suturing instruction. Videos were categorized by the instructor's medical specialty and the suture patterns demonstrated. Engagement metrics, including view count and like count, were analyzed to assess viewer preferences and trends.
    RESULTS: Videos created by healthcare professionals, particularly those with surgical backgrounds, attracted the highest levels of engagement. Plastic and general surgeons produced 6 of the 10 most-viewed videos, while oral-maxillofacial surgeons had the highest median view count per video and neurosurgeons had the highest median like count. Fundamental suture patterns, such as subcuticular running, simple interrupted, and simple running, were the most popular, aligning with the needs of trainees. However, advanced suturing techniques were underrepresented, suggesting a gap in available instructional content.
    CONCLUSIONS: This study highlights the importance of healthcare professionals in producing high-quality YouTube content to meet the demand for reliable medical education. While fundamental techniques are well-represented, there is a need for comprehensive, high-quality tutorials on advanced suturing methods. By expanding educational content, medical professionals can enhance the quality of open-access surgical education for a global audience.
    Keywords:  YouTube; medical education; online education; surgical education; suturing techniques
    DOI:  https://doi.org/10.1016/j.jsurg.2025.103547
  11. Digit Health. 2025 Jan-Dec;11: 20552076251346579
       Background: Atopic dermatitis (AD) is prevalent worldwide. People are increasingly obtaining health information through social videos. In China, the quality, reliability, understandability, and actionability of AD-related videos have yet to be fully studied.
    Objective: This study aimed to analyze AD-related videos on Chinese video-sharing platforms.
    Methods: Three Chinese-language keywords corresponding to "atopic dermatitis," "eczema," and "atopic eczema" were used on Bilibili and Douyin (the Chinese version of TikTok) to retrieve videos from May 25 to November 25, 2024. The reliability of the included videos was evaluated using the Journal of the American Medical Association (JAMA) criteria and the modified DISCERN (mDISCERN), quality was evaluated using the Global Quality Scale (GQS), and understandability and actionability were evaluated using the Patient Education Materials Assessment Tool (PEMAT). Spearman correlation was used to explore correlations among the different variables (see the short sketch after this entry).
    Results: A total of 368 videos were included, and activity on Douyin was higher than on Bilibili (P < .001). Although Douyin's AD videos were rated higher than Bilibili's in reliability and understandability (all P < .001), neither was satisfactory. Medical practitioners were the main source of videos (n = 326, 88.59%), mostly addressing treatment (n = 159, 43.21%). The videos they uploaded had higher JAMA, mDISCERN, and understandability ratings and were also more popular (all P < .05). More videos on Bilibili involved traditional Chinese medicine (TCM) (n = 81, 52.60%); however, the ratings of such videos were lower than those of videos without TCM on either platform (all P < .001). Videos about treatment and prevention were generally the two best categories in terms of actionability (all P < .05). The number of likes, shares, comments, collections, and fans, the JAMA score, and understandability were positively correlated (all P < .05).
    Conclusions: The overall quality of the AD-related videos flooding social media is suboptimal. Videos from professionals are more reliable and should be promoted. The public should exercise caution when searching for healthcare information on video platforms.
    Keywords:  Digital health; chronic; eHealth; health informatics; public health; social media
    DOI:  https://doi.org/10.1177/20552076251346579
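    As an illustration of the correlation analysis mentioned in the methods, the sketch below computes a Spearman rank correlation between two engagement metrics with SciPy; the numbers are invented and do not reproduce the study's data.

        from scipy.stats import spearmanr

        # Hypothetical per-video engagement counts.
        likes = [120, 45, 980, 10, 300, 75, 2200, 15]
        shares = [30, 12, 410, 2, 95, 20, 800, 5]

        # Spearman's rho uses ranks, so it tolerates the skewed distributions typical of engagement counts.
        rho, p_value = spearmanr(likes, shares)
        print(f"rho = {rho:.2f}, p = {p_value:.4f}")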
  12. J Ment Health. 2025 Jun 02. 1-6
       BACKGROUND: Depression is a common mental disorder worldwide. The internet offers a wide range of digital resources on depression, but their credibility is sometimes doubted because erroneous information can worsen the stigma surrounding mental health and deter individuals from seeking professional services.
    AIMS: This paper aimed to analyze the content on depression available on Google websites and YouTube videos.
    METHODS: The readability, trustworthiness, understandability, and overall quality of the information were investigated using readability indexes, PEMAT-AV, NLM criteria for trustworthiness, and a self-structured questionnaire.
    RESULTS: A total of 85 websites and 80 YouTube videos on depression were evaluated. The National Work Group on Literacy and Health recommends that patient-oriented literature be written at or below a sixth-grade level; however, 88% of the websites were written above a ninth-grade level and were difficult to read. The majority of YouTube videos were from private agencies rather than government agencies. Most content described clinical symptoms, with 50% validating ICD/DSM criteria, but less than 50% detailed onset, prognosis, or course of illness. Websites described treatment modalities more frequently and had greater educational utility.
    CONCLUSIONS: There is a need for regulations on the dissemination of health-related information on the internet.
    Keywords:  Google; PEMAT-AV; YouTube; depression; quality of information; readability
    DOI:  https://doi.org/10.1080/09638237.2025.2512327
  13. BMC Gastroenterol. 2025 Jun 02. 25(1): 423
       BACKGROUND: Chronic pancreatitis is a chronic inflammatory disease of the pancreatic tissue caused by genetic or environmental factors; it has a complex etiology and is difficult to diagnose and treat clinically, severely affecting the physical and mental health of patients. Currently, the treatment of chronic pancreatitis often relies on lifelong self-health management by patients. However, the quality of videos related to chronic pancreatitis on short video platforms remains to be determined, and these videos may contain erroneous information that patients cannot recognize. This study aims to assess the quality of information in short videos related to chronic pancreatitis on the Chinese platforms TikTok and Bilibili.
    METHODS: Based on comprehensive rankings, the top 100 videos related to chronic pancreatitis on TikTok and Bilibili were searched, filtered, and evaluated by two independent gastroenterologists via the Global Quality Score and the improved DISCERN tool. The content of the videos was analyzed from six aspects: definition, symptoms, risk factors, diagnosis, treatment, and outcomes.
    RESULTS: A total of 112 videos related to chronic pancreatitis were collected, with the majority (80.36%) being from health professionals, including 20.55% from gastrointestinal health experts and 26.79% from pancreatic surgery specialists. The overall quality and reliability of the videos were relatively low, with DISCERN and GQS scores of 2 (IQR: 2-3) and 3 (IQR: 2-3), respectively. In comparison, videos from gastrointestinal health professionals were more comprehensive in covering chronic pancreatitis content and showed the highest reliability and quality, with DISCERN scores of 3 (IQR: 2-3) and GQS scores of 3 (IQR: 2-3).
    CONCLUSION: Overall, the content and quality of video information related to chronic pancreatitis on the two short video platforms in China still require improvement. In the future, health professionals need to provide high-quality videos to promote effective self-disease management among patients with chronic pancreatitis.
    Keywords:  Chronic pancreatitis; Information quality; Short video apps; Social media; TikTok
    DOI:  https://doi.org/10.1186/s12876-025-04005-8