bims-librar 2024-04-07 papers

bims-librar

Biomed News

on Biomedical librarianship

Issue of 2024–04–07
nineteen papers selected by
Thomas Krichel, Open Library Society

PubTator 3.0: an AI-powered literature resource for unlocking biomedical knowledge.
Open-source large language models in action: A bioinformatics chatbot for PRIDE database.
Evaluation of Large Language Model Performance and Reliability for Citations and References in Scholarly Writing: Cross-Disciplinary Study.
Responses of Five Different Artificial Intelligence Chatbots to the Top Searched Queries About Erectile Dysfunction: A Comparative Analysis.
Evaluating the accuracy and relevance of ChatGPT responses to frequently asked questions regarding total knee replacement.
Evaluation of the safety, accuracy, and helpfulness of the GPT-4.0 Large Language Model in neurosurgery.
'Just Google it'-A scoping review of online mental health resources for survivors of breast cancer.
Evaluating Readability, Understandability, and Actionability of Online Printable Patient Education Materials for Cholesterol Management: A Systematic Review.
Assessment of the quality of online patient information resources for patients considering parastomal hernia treatment.
Evaluating the Readability of Patient Education Materials for Anterior Vertebral Body Tethering, Distraction-Based Methods, and Posterior Spinal Fusion for the Treatment of Pediatric Spinal Deformity.
Analysis of the Most Popular Online Ankle Fracture-Related Patient Education Materials.
Correction: YouTube online videos as a source for patient education of cervical spondylosis-a reliability and quality analysis.
Gynecomastia Surgery Patient Education: An Information Quality Assessment of YouTube Videos.
HSR24-165: Educational Efficacy of YouTube Videos on Multiple Myeloma.
Evaluation of the Quality and Reliability of YouTube™ Videos Created by Orthodontists as an Information Source for Clear Aligners.
Contents analysis of thyroid cancer-related information uploaded to YouTube by physicians in Korea: endorsing thyroid cancer screening, potentially leading to overdiagnosis.
Assessing the educational value of laparoscopic radical nephrectomy videos on YouTube®: A comparative analysis of short versus long videos.
Modern internet search analytics and thyroidectomy: What are patients asking?
Using infographics in disseminating healthy lifestyle information on social media is likely to increase uptake and sharing.

Nucleic Acids Res. 2024 Apr 04. pii: gkae235. [Epub ahead of print]

PubTator 3.0: an AI-powered literature resource for unlocking biomedical knowledge.

Chih-Hsuan Wei, Alexis Allot, Po-Ting Lai, Robert Leaman, Shubo Tian, Ling Luo, Qiao Jin, Zhizheng Wang, Qingyu Chen, Zhiyong Lu.

PubTator 3.0 (https://www.ncbi.nlm.nih.gov/research/pubtator3/) is a biomedical literature resource using state-of-the-art AI techniques to offer semantic and relation searches for key concepts like proteins, genetic variants, diseases and chemicals. It currently provides over one billion entity and relation annotations across approximately 36 million PubMed abstracts and 6 million full-text articles from the PMC open access subset, updated weekly. PubTator 3.0's online interface and API utilize these precomputed entity relations and synonyms to provide advanced search capabilities and enable large-scale analyses, streamlining many complex information needs. We showcase the retrieval quality of PubTator 3.0 using a series of entity pair queries, demonstrating that PubTator 3.0 retrieves a greater number of articles than either PubMed or Google Scholar, with higher precision in the top 20 results. We further show that integrating ChatGPT (GPT-4) with PubTator APIs dramatically improves the factuality and verifiability of its responses. In summary, PubTator 3.0 offers a comprehensive set of features and tools that allow researchers to navigate the ever-expanding wealth of biomedical literature, expediting research and unlocking valuable insights for scientific discovery.

DOI: https://doi.org/10.1093/nar/gkae235
Proteomics. 2024 Mar 31. e2400005

Open-source large language models in action: A bioinformatics chatbot for PRIDE database.

Jingwen Bai, Selvakumar Kamatchinathan, Deepti J Kundu, Chakradhar Bandla, Juan Antonio Vizcaíno, Yasset Perez-Riverol.

We present here a chatbot assistant infrastructure (https://www.ebi.ac.uk/pride/chatbot/) that simplifies user interactions with the PRIDE database's documentation and dataset search functionality. The framework utilizes multiple Large Language Models (LLMs): llama2, chatglm, mixtral (mistral), and openhermes. It also includes a web service API (Application Programming Interface), a web interface, and components for indexing and managing vector databases. An Elo-ranking system-based benchmark component is included in the framework as well, which allows for evaluating the performance of each LLM and for improving PRIDE documentation. The chatbot not only allows users to interact with PRIDE documentation but can also be used to search and find PRIDE datasets using an LLM-based recommendation system, enabling dataset discoverability. Importantly, while our infrastructure is exemplified through its application in the PRIDE database context, the modular and adaptable nature of our approach positions it as a valuable tool for improving user experiences across a spectrum of bioinformatics and proteomics tools and resources, among other domains. The integration of advanced LLMs, innovative vector-based construction, the benchmarking framework, and optimized documentation collectively forms a robust and transferable chatbot assistant infrastructure. The framework is open-source (https://github.com/PRIDE-Archive/pride-chatbot).
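
The abstract does not say how the Elo-based benchmark is computed. As a rough illustration only, the standard Elo update for a single pairwise comparison between two models' answers might look like the sketch below. The K-factor of 32, the starting rating of 1000, and all function and model names are assumptions made for this example, not details taken from the PRIDE chatbot code.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A's answer was preferred, 0.0 if B's was, and 0.5 for a tie.
    The K-factor default is an assumption, not a value from the paper.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical example: two models start level and model_a wins one comparison.
ratings = {"model_a": 1000.0, "model_b": 1000.0}
ratings["model_a"], ratings["model_b"] = elo_update(
    ratings["model_a"], ratings["model_b"], score_a=1.0
)
print(ratings)  # {'model_a': 1016.0, 'model_b': 984.0}
```

Repeating this update over many human or automated preference judgments yields a ranking of the models; the update is zero-sum, so the total rating stays constant.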

Keywords:  bioinformatics; dataset discoverability; documentation; large language models; proteomics; public data; software architectures; training

DOI:  https://doi.org/10.1002/pmic.202400005
J Med Internet Res. 2024 Apr 05. 26: e52935

Evaluation of Large Language Model Performance and Reliability for Citations and References in Scholarly Writing: Cross-Disciplinary Study.

Joseph Mugaanyi, Liuying Cai, Sumei Cheng, Caide Lu, Jing Huang.

   BACKGROUND: Large language models (LLMs) have gained prominence since the release of ChatGPT in late 2022.
OBJECTIVE: The aim of this study was to assess the accuracy of citations and references generated by ChatGPT (GPT-3.5) in two distinct academic domains: the natural sciences and humanities.
METHODS: Two researchers independently prompted ChatGPT to write an introduction section for a manuscript and include citations; they then evaluated the accuracy of the citations and Digital Object Identifiers (DOIs). Results were compared between the two disciplines.
RESULTS: Ten topics were included: 5 in the natural sciences and 5 in the humanities. A total of 102 citations were generated, with 55 in the natural sciences and 47 in the humanities. Among these, 40 citations (72.7%) in the natural sciences and 36 citations (76.6%) in the humanities were confirmed to exist (P=.42). There were significant disparities found in DOI presence in the natural sciences (39/55, 70.9%) and the humanities (18/47, 38.3%), along with significant differences in accuracy between the two disciplines (18/55, 32.7% vs 4/47, 8.5%). DOI hallucination was more prevalent in the humanities (42/47, 89.4%). The Levenshtein distance was significantly higher in the humanities than in the natural sciences, reflecting the lower DOI accuracy.
CONCLUSIONS: ChatGPT's performance in generating citations and references varies across disciplines. Differences in DOI standards and disciplinary nuances contribute to performance variations. Researchers should consider the strengths and limitations of artificial intelligence writing tools with respect to citation accuracy. The use of domain-specific models may enhance accuracy.
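
The Levenshtein distance mentioned in the results counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The abstract does not state exactly which strings were compared; a plausible reading is the generated DOI against the true DOI. The sketch below assumes that reading, and the DOI strings in it are made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


# Hypothetical DOIs, not taken from the study.
generated_doi = "10.1000/example.12354"
actual_doi = "10.1000/example.12345"
print(levenshtein(generated_doi, actual_doi))  # 2
```

A distance of 0 means the generated DOI is exact; larger values indicate a DOI that drifts further from the real identifier.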

Keywords:  AI; ChatGPT; GPT-3.5; LLMs; NLP; academic discourse; academic writing; accuracy; artificial intelligence; chatbot; citations; cross-disciplinary evaluation; humanities; large language models; machine learning algorithms; natural language processing; natural science; references; scholarly; scholarly writing; writing tool

DOI:  https://doi.org/10.2196/52935
J Med Syst. 2024 Apr 03. 48(1): 38

Responses of Five Different Artificial Intelligence Chatbots to the Top Searched Queries About Erectile Dysfunction: A Comparative Analysis.

Mehmet Fatih Şahin, Hüseyin Ateş, Anıl Keleş, Rıdvan Özcan, Çağrı Doğan, Murat Akgül, Cenk Murat Yazıcı.

The aim of the study is to evaluate and compare the quality and readability of responses generated by five different artificial intelligence (AI) chatbots (ChatGPT, Bard, Bing, Ernie, and Copilot) to the top searched queries about erectile dysfunction (ED). Google Trends was used to identify relevant ED-related phrases. Each AI chatbot received a specific sequence of 25 frequently searched terms as input. Responses were evaluated using DISCERN, Ensuring Quality Information for Patients (EQIP), and Flesch-Kincaid Grade Level (FKGL) and Reading Ease (FKRE) metrics. The top three most frequently searched phrases were "erectile dysfunction cause", "how to erectile dysfunction," and "erectile dysfunction treatment." Zimbabwe, Zambia, and Ghana exhibited the highest level of interest in ED. None of the AI chatbots achieved the necessary degree of readability. However, Bard exhibited significantly higher FKRE and FKGL ratings (p = 0.001), and Copilot achieved better EQIP and DISCERN ratings than the other chatbots (p = 0.001). Bard exhibited the simplest linguistic framework and posed the least challenge in terms of readability and comprehension, and Copilot's text quality on ED was superior to the other chatbots. As new chatbots are introduced, their understandability and text quality increase, providing better guidance to patients.
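
For readers unfamiliar with the two readability metrics, both are simple functions of average sentence length and average syllables per word. The sketch below uses the standard published formulas and takes word, sentence, and syllable counts as inputs; the counts in the example are invented, and how the study obtained its counts is not stated in the abstract.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade needed to read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical passage: 100 words, 5 sentences, 150 syllables.
fre = flesch_reading_ease(100, 5, 150)
fkgl = flesch_kincaid_grade(100, 5, 150)
print(round(fre, 2), round(fkgl, 2))
```

Because the two scores run in opposite directions (a higher reading-ease score is easier, a higher grade level is harder), results that report both should be read with that in mind.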

Keywords:  Artificial intelligence; Bard; Bing; ChatGPT; Chatbot; Copilot; Erectile dysfunction; Ernie bot

DOI:  https://doi.org/10.1007/s10916-024-02056-0
Knee Surg Relat Res. 2024 Apr 02. 36(1): 15

Evaluating the accuracy and relevance of ChatGPT responses to frequently asked questions regarding total knee replacement.

Siyuan Zhang, Zi Qiang Glen Liau, Kian Loong Melvin Tan, Wei Liang Chua.

   BACKGROUND: Chat Generative Pretrained Transformer (ChatGPT), a generative artificial intelligence chatbot, may have broad applications in healthcare delivery and patient education due to its ability to provide human-like responses to a wide range of patient queries. However, there is limited evidence regarding its ability to provide reliable and useful information on orthopaedic procedures. This study seeks to evaluate the accuracy and relevance of responses provided by ChatGPT to frequently asked questions (FAQs) regarding total knee replacement (TKR).
METHODS: A list of 50 clinically-relevant FAQs regarding TKR was collated. Each question was individually entered as a prompt to ChatGPT (version 3.5), and the first response generated was recorded. Responses were then reviewed by two independent orthopaedic surgeons and graded on a Likert scale for their factual accuracy and relevance. These responses were then classified into accurate versus inaccurate and relevant versus irrelevant responses using preset thresholds on the Likert scale.
RESULTS: Most responses were accurate, while all responses were relevant. Of the 50 FAQs, 44/50 (88%) of ChatGPT responses were classified as accurate, achieving a mean Likert grade of 4.6/5 for factual accuracy. On the other hand, 50/50 (100%) of responses were classified as relevant, achieving a mean Likert grade of 4.9/5 for relevance.
CONCLUSION: ChatGPT performed well in providing accurate and relevant responses to FAQs regarding TKR, demonstrating great potential as a tool for patient education. However, it is not infallible and can occasionally provide inaccurate medical information. Patients and clinicians intending to utilize this technology should be mindful of its limitations and ensure adequate supervision and verification of information provided.

Keywords:  Artificial intelligence; ChatGPT; Chatbot; Large language model; Total knee arthroplasty; Total knee replacement

DOI:  https://doi.org/10.1186/s43019-024-00218-5
J Clin Neurosci. 2024 Apr 03. pii: S0967-5868(24)00123-1. [Epub ahead of print] 123: 151-156

Evaluation of the safety, accuracy, and helpfulness of the GPT-4.0 Large Language Model in neurosurgery.

Kevin T Huang, Neel H Mehta, Saksham Gupta, Alfred P See, Omar Arnaout.

   BACKGROUND: Although prior work demonstrated the surprising accuracy of Large Language Models (LLMs) on neurosurgery board-style questions, their use in day-to-day clinical situations warrants further investigation. This study assessed GPT-4.0's responses to common clinical questions across various subspecialties of neurosurgery.
METHODS: A panel of attending neurosurgeons formulated 35 general neurosurgical questions spanning neuro-oncology, spine, vascular, functional, pediatrics, and trauma. All questions were input into GPT-4.0 with a prespecified, standard prompt. Responses were evaluated by two attending neurosurgeons, each on a standardized scale for accuracy, safety, and helpfulness. Citations were indexed and evaluated against identifiable database references.
RESULTS: GPT-4.0 responses were consistent with current medical guidelines and accounted for recent advances in the field 92.8 % and 78.6 % of the time respectively. Neurosurgeons reported GPT-4.0 responses providing unrealistic information or potentially risky information 14.3 % and 7.1 % of the time respectively. Assessed on 5-point scales, responses suggested that GPT-4.0 was clinically useful (4.0 ± 0.6), relevant (4.7 ± 0.3), and coherent (4.9 ± 0.2). The depth of clinical responses varied (3.7 ± 0.6), and "red flag" symptoms were missed 7.1 % of the time. Moreover, GPT-4.0 cited 86 references (2.46 citations per answer), of which only 50 % were deemed valid, and 77.1 % of responses contained at least one inappropriate citation.
CONCLUSION: Current general LLM technology can offer generally accurate, safe, and helpful neurosurgical information, but may not fully evaluate medical literature or recent field advances. Citation generation and usage remains unreliable. As this technology becomes more ubiquitous, clinicians will need to exercise caution when dealing with it in practice.

Keywords:  Consultation; GPT-4.0; LLMs; Neurosurgery

DOI:  https://doi.org/10.1016/j.jocn.2024.03.021
Psychooncology. 2024 Apr;33(4): e6337

'Just Google it'-A scoping review of online mental health resources for survivors of breast cancer.

Natalie Tuckey, Matthew Iasiello, Nadia Corsini, Bogda Koczwara, Monique Bareham, Amy Wellalagodage, Hannah R Wardill.

   OBJECTIVE: As the Internet is a ubiquitous resource for information, we aimed to replicate a patient's Google search to identify and assess the quality of online mental health/wellbeing materials available to support women living with or beyond cancer.
METHODS: A Google search was performed using a key term search strategy including search strings 'cancer', 'wellbeing', 'distress' and 'resources' to identify online resources of diverse formats (i.e., factsheet, website, program, course, video, webinar, e-book, podcast). The quality evaluation scoring tool (QUEST) was used to analyse the quality of health information provided.
RESULTS: The search strategy resulted in 283 resources, 117 of which met inclusion criteria across four countries: Australia, USA, UK, and Canada. Websites and factsheets were primarily retrieved. The average QUEST score was 10.04 (highest possible score is 28), indicating low quality, with 92.31% of resources lacking references to sources of information.
CONCLUSIONS: Our data indicated a lack of evidence-based support resources and engaging information available online for people living with or beyond cancer. The majority of online resources were non-specific to breast cancer and lacked authorship and attribution.

Keywords:  Internet; cancer; eHealth; mental health; oncology; online; support; survivorship; wellbeing

DOI:  https://doi.org/10.1002/pon.6337
J Am Heart Assoc. 2024 Apr 03. e030140

Evaluating Readability, Understandability, and Actionability of Online Printable Patient Education Materials for Cholesterol Management: A Systematic Review.

Chaitanya Bhatt, Ethan Lin, Laura E Ferreira-Legere, Cynthia A Jackevicius, Dennis T Ko, Douglas S Lee, Kathryn Schade, Sharon Johnston, Todd J Anderson, Jacob A Udell.

   BACKGROUND: Dyslipidemia management is a cornerstone in cardiovascular disease prevention and relies heavily on patient adherence to lifestyle modifications and medications. Numerous cholesterol patient education materials are available online, but it remains unclear whether these resources are suitable for the majority of North American adults given the prevalence of low health literacy. This review aimed to (1) identify printable cholesterol patient education materials through an online search, and (2) evaluate the readability, understandability, and actionability of each resource to determine its utility in practice.
METHODS AND RESULTS: We searched the MEDLINE database for peer-reviewed educational materials and the websites of Canadian and American national health organizations for gray literature. Readability was measured using the Flesch-Kincaid Grade Level, and scores between fifth- and sixth-grade reading levels were considered adequate. Understandability and actionability were scored using the Patient Education Materials Assessment Tool and categorized as superior (>80%), adequate (50%-70%), or inadequate (<50%). Our search yielded 91 results that were screened for eligibility. Among the 22 educational materials included in the study, 15 were identified through MEDLINE, and 7 were from websites. The readability across all materials averaged an 11th-grade reading level (Flesch-Kincaid Grade Level=11.9±2.59). The mean±SD understandability and actionability scores were 82.8±6.58% and 40.9±28.60%, respectively.
CONCLUSIONS: The readability of online cholesterol patient education materials consistently exceeds the health literacy level of the average North American adult. Many resources also inadequately describe action items for individuals to self-manage their cholesterol, representing an implementation gap in cardiovascular disease prevention.

Keywords:  cholesterol; dyslipidemia; patient education; statins

DOI:  https://doi.org/10.1161/JAHA.123.030140
Colorectal Dis. 2024 Apr 01.

Assessment of the quality of online patient information resources for patients considering parastomal hernia treatment.

Sue Blackwell, Scott Clifford, Thomas Pinkney, Dean Thompson, Jonathan Mathers.

   AIM: The aim was to examine the quality of online patient information resources for patients considering parastomal hernia treatment.
METHODS: A Google search was conducted using lay search terms for patient facing sources on parastomal hernia. The quality of the content was assessed using the validated DISCERN instrument. Readability of written content was established using the Flesch-Kincaid score. Sources were also assessed against the essential content and process standards from the National Institute for Health and Care Excellence (NICE) framework for shared decision making support tools. Content analysis was also undertaken to explore what the sources covered and to identify any commonalities across the content.
RESULTS: Fourteen sources were identified and assessed using the identified tools. The mean Flesch-Kincaid reading ease score was 43.61, suggesting that the information was difficult to read. The overall quality of the identified sources was low based on the pooled analysis of the DISCERN and Flesch-Kincaid scores, and when assessed against the criteria in the NICE standards framework for shared decision making tools. Content analysis identified eight categories encompassing 59 codes, which highlighted considerable variation between sources.
CONCLUSIONS: The current information available to patients considering parastomal hernia treatment is of low quality and often does not contain enough information on treatment options for patients to be able to make an informed decision about the best treatment for them. There is a need for high-quality information, ideally co-produced with patients, to provide patients with the necessary information to allow them to make informed decisions about their treatment options when faced with a symptomatic parastomal hernia.

Keywords:  parastomal hernia; patient information; shared decision making

DOI:  https://doi.org/10.1111/codi.16959
Int J Spine Surg. 2024 Apr 04. pii: 8591. [Epub ahead of print]

Evaluating the Readability of Patient Education Materials for Anterior Vertebral Body Tethering, Distraction-Based Methods, and Posterior Spinal Fusion for the Treatment of Pediatric Spinal Deformity.

Ari R Berg, Adam N Fano, Jacob Ball, Matthew J Weintraub, Michael W Fields, Ashok Para, Folorunsho Edobor-Osula, Alice Chu, Michael Vives, Neil Kaushal.

   BACKGROUND: The Internet is an important source of information for patients, but its effectiveness relies on the readability of its content. Patient education materials (PEMs) should be written at or below a sixth-grade reading level as outlined by agencies such as the American Medical Association. This study assessed PEMs' readability for the novel anterior vertebral body tethering (AVBT), distraction-based methods, and posterior spinal fusion (PSF) in treating pediatric spinal deformity.
METHODS: An online search identified PEMs using the terms "anterior vertebral body tethering," "growing rods scoliosis," and "posterior spinal fusion pediatric scoliosis." We selected the first 20 general medical websites (GMWs) and 10 academic health institution websites (AHIWs) discussing each treatment (90 websites total). Readability tests for each webpage were conducted using Readability Studio software. Reading grade levels (RGLs), which correspond to the US grade at which one is expected to comprehend the text, were calculated for sources, and independent t tests compared RGLs between treatment types.
RESULTS: The mean RGL was 12.1 ± 2.0. No articles were below a sixth-grade reading level, with only 2.2% at the sixth-grade reading level. AVBT articles had a higher RGL than distraction-based methods (12.7 ± 1.6 vs 11.9 ± 1.9, P = 0.082) and PSF (12.7 ± 1.6 vs 11.6 ± 2.3, P = 0.032). Materials for distraction-based methods and PSF were comparable (11.9 ± 1.9 vs 11.6 ± 2.3, P = 0.566). Among GMWs, AVBT materials had a higher RGL than distraction-based methods (12.9 ± 1.4 vs 12.1 ± 1.8, P = 0.133) and PSF (12.9 ± 1.4 vs 11.4 ± 2.4, P = 0.016).
CLINICAL RELEVANCE: Patients' health literacy is important for shared decision-making. Assessing the readability of scoliosis treatment PEMs guides physicians when sharing resources and discussing treatment with patients.
CONCLUSION: Both GMWs and AHIWs exceed recommended RGLs, which may limit patient and parent understanding. Within GMWs, AVBT materials are written at a higher RGL than other treatments, which may hinder informed decision-making and patient outcomes. Efforts should be made to create online resources at the appropriate RGL. At the very least, patients and parents may be directed toward AHIWs, whose RGLs are more consistent.
LEVEL OF EVIDENCE: 3.
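
The group comparisons above can be roughly checked from the reported summary statistics alone. The sketch below computes a two-sample t statistic and Welch degrees of freedom from means, standard deviations, and group sizes. It assumes 30 webpages per treatment (20 GMWs plus 10 AHIWs, as the methods describe) and uses Welch's unequal-variance form, which may differ from the exact test the authors ran. Because the reported means and SDs are rounded, the result only approximates the published P values.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first group's mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second group's mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# AVBT vs PSF reading grade levels as reported; n = 30 per group is an assumption.
t, df = welch_t(12.7, 1.6, 30, 11.6, 2.3, 30)
print(round(t, 2), round(df, 1))
```

Converting the statistic to a P value requires the t distribution's cumulative distribution function, which the Python standard library does not provide; a statistics package would be needed for that step.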

Keywords:  AVBT; distraction; education; fusion; readability

DOI:  https://doi.org/10.14444/8591
Foot Ankle Orthop. 2024 Apr;9(2): 24730114241241310

Analysis of the Most Popular Online Ankle Fracture-Related Patient Education Materials.

Haad A Arif, Gavin LeBrun, Simon T Moore, David A Friscia.

Background: Given the increasing accessibility of the Internet, it is critical to ensure that the informational material available online for patient education is both accurate and readable to promote a greater degree of health literacy. This study sought to investigate the quality and readability of the most popular online resources for ankle fractures.
Methods: After conducting a Google search using 6 terms related to ankle fractures, we collected the first 20 nonsponsored results for each term. Readability was evaluated using the Flesch Reading Ease (FRE), Flesch-Kincaid Grade Level (FKGL), and Gunning Fog Index (GFI) instruments. Quality was evaluated using a custom-created Ankle Fracture Index (AFI).
Results: A total of 46 of 120 articles met the inclusion criteria. The mean FKGL, FRE, and GFI scores were 8.4 ± 0.5, 57.5 ± 3.2, and 10.5 ± 0.5, respectively. The average AFI score was 15.4 ± 1.4, corresponding to an "acceptable" quality rating. Almost 70% of articles (n = 32) were written at or below the recommended eighth-grade reading level. Most articles discussed the need for imaging in diagnosis and treatment planning while neglecting to discuss the risks of surgery or potential future operations.
Conclusion: We found that online patient-facing materials on ankle fractures demonstrated an eighth-grade average reading grade level and an acceptable quality on content analysis. Further work should focus on expanding information about risk factors, complications of surgery, and long-term recovery while ensuring that readability levels remain at or below the eighth-grade level.
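
Of the three readability instruments named in the methods, the Gunning Fog Index is the one that depends on "complex" words (conventionally, words of three or more syllables) rather than on raw syllable counts. A minimal sketch of the standard formula follows; the counts in the example are invented, and the abstract does not say how the study counted complex words.

```python
def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """Gunning Fog Index: estimated years of schooling needed to read the text."""
    average_sentence_length = words / sentences
    percent_complex = 100 * (complex_words / words)
    return 0.4 * (average_sentence_length + percent_complex)


# Hypothetical passage: 100 words, 5 sentences, 10 complex words.
print(round(gunning_fog(100, 5, 10), 1))  # 12.0
```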

Keywords:  ankle fracture; health literacy; patient education; readability

DOI:  https://doi.org/10.1177/24730114241241310
BMC Public Health. 2024 Apr 05. 24(1): 965

Correction: YouTube online videos as a source for patient education of cervical spondylosis-a reliability and quality analysis.

Wang Hong, Yan Chunyi, Wu Tingkui, Zhang Xiang, He Junbo, Liu Zhihao, Liu Hao.

DOI: https://doi.org/10.1186/s12889-023-16807-0
Ann Plast Surg. 2024 Feb 27.

Gynecomastia Surgery Patient Education: An Information Quality Assessment of YouTube Videos.

Praneet S Paidisetty, Leonard K Wang, Ashley Shin, Jacob Urbina, David Mitchell, Amy Quan, Chioma G Obinero, Wendy Chen.

BACKGROUND: YouTube is a platform for many topics, including plastic surgery. Previous studies have shown poor educational value in YouTube videos of plastic surgery procedures. The purpose of this study was to evaluate the quality and accuracy of YouTube videos concerning gynecomastia surgery (GS).
METHODS: The phrases "gynecomastia surgery" (GS) and "man boobs surgery" (MB) were queried on YouTube. The first 50 videos for each search term were examined. The videos were rated using our novel Gynecomastia Surgery Specific Score to measure gynecomastia-specific information, the Patient Education Materials Assessment Tool (PEMAT) to measure understandability and actionability, and the Global Quality Scale to measure general quality.
RESULTS: The most common upload source was a board-certified plastic surgeon (35%), and content category was surgery techniques and consultations (51%). Average scores for the Global Quality Scale (x̄ = 2.25), Gynecomastia Surgery Specific Score (x̄ = 3.50), and PEMAT Actionability (x̄ = 44.8%) were low, whereas PEMAT Understandability (x̄ = 77.4%) was moderate to high. There was no difference in all scoring modalities between the GS and MB groups. Internationally uploaded MB videos tended to originate from Asian countries, whereas GS videos tended to originate from non-US Western countries. Patient uploaders had higher PEMAT Actionability scores than plastic surgeon uploaders.
CONCLUSIONS: The quality and amount of gynecomastia-specific information in GS videos on YouTube are low and contain few practical, take-home points for patients. However, understandability is adequate. Plastic surgeons and professional societies should strive to create high-quality medical media on platforms such as YouTube.

DOI: https://doi.org/10.1097/SAP.0000000000003813
J Natl Compr Canc Netw. 2024 Apr 05. pii: HSR24-165. [Epub ahead of print] 22(2.5)

HSR24-165: Educational Efficacy of YouTube Videos on Multiple Myeloma.

Vaishnavi Singh, Shiva Jashwanth Gaddam, Poornima Ramadas, Samip Master.

DOI: https://doi.org/10.6004/jnccn.2023.7149
Turk J Orthod. 2024 Mar 28. 37(1): 44-49

Evaluation of the Quality and Reliability of YouTube™ Videos Created by Orthodontists as an Information Source for Clear Aligners.

Emre Cesur, Koray Tuncer, Duygu Sevgi, Barkın Cem Balaban, Can Arslan.

   Objective: This study aimed to evaluate the quality, reliability, and content usefulness of videos created by orthodontists on clear orthodontic aligners.
Methods: Videos were screened using YouTube™ by conducting a search for "Invisalign". After a preliminary evaluation of the first 250 results, 61 videos that met the selection criteria were scored and their length, days since upload, and numbers of views, likes, dislikes, and comments were recorded. These data were used to calculate the interaction index and viewing rate. Video reliability was assessed using a five-item modified DISCERN index, and video quality was assessed using the Video Information and Quality Index. A 10-item content usefulness index was created to determine the usefulness of the video content. Descriptive statistics of the parameters were calculated, and correlation coefficients were calculated to evaluate the relationships between the parameters.
Results: The mean reliability score was 2.75±1.02 (out of 5), and the total quality score was 11.80±3.38 (out of 20). The total content usefulness index was quite low, with a mean score of 2.52±2.14 (out of 10). Interaction index and viewing rate were positively correlated with reliability score (r=0.463, p<0.01; r=0.295, p<0.05) and total quality score (r=0.365, p<0.01; r=0.295, p<0.01, respectively). The reliability score was positively correlated with the total quality score (r=0.842, p<0.01) and total content usefulness index (r=0.346, p<0.01).
Conclusion: Videos about orthodontic aligner treatment have average reliability and quality but largely insufficient content.

Keywords:  Clear aligners; YouTube; invisalign; invisible orthodontics; video quality

DOI:  https://doi.org/10.4274/TurkJOrthod.2023.2022.127
BMC Public Health. 2024 Apr 02. 24(1): 942

Contents analysis of thyroid cancer-related information uploaded to YouTube by physicians in Korea: endorsing thyroid cancer screening, potentially leading to overdiagnosis.

EunKyo Kang, HyoRim Ju, Soojeong Kim, Juyoung Choi.

   BACKGROUND: Thyroid cancer overdiagnosis is a major public health issue in South Korea, which has the highest incidence rate. The accessibility of information through the Internet, particularly on YouTube, could potentially impact excessive screening. This study aimed to analyze the content of thyroid cancer-related YouTube videos, particularly those from 2016 onwards, to evaluate the potential spread of misinformation.
METHODS: A total of 326 videos for analysis were collected using a video search protocol with the keyword "thyroid cancer" on YouTube. This study classified the selected YouTube videos as either provided by medical professionals or not and used topic clustering with LDA (latent Dirichlet allocation), sentiment analysis with KoBERT (Korean bidirectional encoder representations from transformers), and reliability evaluation to analyze the content. The proportion of mentions of poor prognosis for thyroid cancer and the categorization of advertising content were also analyzed.
RESULTS: Videos by medical professionals were categorized into 7 topics, with "Thyroid cancer is not a 'Good cancer'" being the most common. The number of videos opposing excessive thyroid cancer screening decreased gradually yearly. Videos advocating screening received more favorable comments from viewers than videos opposing excessive thyroid cancer screening. Patient experience videos were categorized into 6 topics, with the "Treatment process and after-treatment" being the most common.
CONCLUSION: This study found that a significant proportion of videos uploaded by medical professionals on thyroid cancer endorse thyroid cancer screening, potentially leading to excessive treatments. The study highlights the need for medical professionals to provide high-quality and unbiased information on social media platforms to prevent the spread of medical misinformation and the need for criteria to judge the content and quality of online health information.

Keywords:  Overdiagnosis; Sentiment analysis; Thyroid cancer screening; YouTube

DOI:  https://doi.org/10.1186/s12889-024-18403-2
J Minim Access Surg. 2024 Mar 28.

Assessing the educational value of laparoscopic radical nephrectomy videos on YouTube®: A comparative analysis of short versus long videos.

Muharrem Baturu, Mehmet Öztürk, Ömer Bayrak, Sakıp Erturhan, Ilker Seckiner.

INTRODUCTION: To evaluate the quality of laparoscopic radical nephrectomy videos and determine the extent to which they are informative and educational for healthcare professionals.
PATIENTS AND METHODS: We used the YouTube® search engine to search for the term 'laparoscopic radical nephrectomy' with time filters of 4-20 min (Group 1) and >20 min (Group 2) and then sorted the results uploaded before January 2023 chronologically. One hundred videos were analysed for each group. The reliability of the videos was assessed using the Journal of the American Medical Association (JAMA) Benchmark Criteria and DISCERN questionnaire scores (DISCERN). Educational quality was assessed using the Global Quality Score (GQS) and a 20-item objective scoring system (OSS) for laparoscopic nephrectomy. The popularity of the videos was evaluated using the video power index (VPI).
RESULTS: The mean video duration was 8.9 ± 4.3 min in Group 1 and 52.02 ± 31.09 min in Group 2 (P < 0.001). The mean JAMA (2.49 ± 0.61) and OSS scores (60 ± 12.3) were higher in Group 2 than in Group 1, while no significant difference was observed in the mean GQS (2.53 ± 0.7, 2.39 ± 0.88, respectively) between the groups (P < 0.001, P = 0.039, P = 0.131, respectively).
CONCLUSION: While the standardisation of surgical videos published on YouTube® and the establishment of auditing mechanisms do not seem plausible, high total OSS, periprocedural OSS, and VPI scores, and high OSS, JAMA, GQS and DISCERN scores in long videos indicate that such videos offer a greater contribution to education.

DOI: https://doi.org/10.4103/jmas.jmas_355_23
World J Otorhinolaryngol Head Neck Surg. 2024 Mar;10(1): 49-58

Modern internet search analytics and thyroidectomy: What are patients asking?

Neeraj Suresh, Christian Fritz, Emma De Ravin, Karthik Rajasekaran.

   Objectives: Thyroidectomy is among the most commonly performed head and neck surgeries; however, limited information is available on topics of interest and concern to patients.
Study Design: Observational.
Setting: Online.
Methods: A search engine optimization tool was utilized to extract metadata on Google-suggested questions that "People Also Ask" (PAA) pertaining to "thyroidectomy" and "thyroid surgery." These questions were categorized by Rothwell criteria and topics of interest. The Journal of the American Medical Association (JAMA) benchmark criteria enabled quality assessment.
Results: A total of 250 PAA questions were analyzed. Future-oriented PAA questions describing what to expect during and after the surgery on topics such as postoperative management, risks or complications of surgery, and technical details were significantly less popular among the "thyroid surgery" group (P < 0.001, P = 0.005, and P < 0.001, respectively). PAA questions about scarring and hypocalcemia were nearly threefold more popular than those related to pain (335 and 319 vs. 113 combined search engine response page count, respectively). The overall JAMA quality score remained low (2.50 ± 1.07), despite an increasing number of patients searching for "thyroidectomy" (r(77) = 0.30, P = 0.007).
Conclusions: Patients searching for the nonspecific term "thyroid surgery" received a curated collection of PAA questions that were significantly less likely to educate them on what to expect during and after surgery, as compared to patients with higher health literacy who search with the term "thyroidectomy." This suggests that the content of PAA questions differs based on the presumed health literacy of the internet user.

Keywords:  information quality; online health education; search analytics; thyroid surgery; thyroidectomy

DOI:  https://doi.org/10.1002/wjo2.117
Health Info Libr J. 2024 Mar 30.

Using infographics in disseminating healthy lifestyle information on social media is likely to increase uptake and sharing.

Sin Ting Chu, Dickson K W Chiu, Kevin K W Ho.

   BACKGROUND: Infographics facilitate rapid information dissemination with enriched eye-catching content on social media, but it is unclear what factors affect the adoption of information presented in this way.
OBJECTIVES: We tested whether the Information Acceptance Model applies to infographics on healthy lifestyle and fitness topics.
METHODS: Two hundred and four university students were invited to participate in an online survey on their acceptance of information after reading infographics on healthy lifestyle and fitness topics shared on social media. The data collected were analysed using Partial Least Squares path modelling.
RESULTS: The results confirmed information usefulness as a predictor of information adoption; attitude towards information and information adoption were the predictors of behavioural intention. Information credibility and attitude towards information, but not information quality and needs, were significantly related to information usefulness. Social media usage and education level were factors affecting infographics impressions.
DISCUSSION: The results support most hypotheses. They confirm information usefulness as a predictor of infographics adoption. Attitudes towards information and information adoption are predictors of behavioural intentions to follow healthy lifestyle and fitness suggestions presented in social media infographics.
CONCLUSION: Social media facilitates interpersonal communication, information exchange and knowledge sharing, and infographics may draw people into healthy lifestyle and fitness information items relevant to them.

Keywords:  attitude; consumer health information; health education; information dissemination; information literacy; public health; social media

DOI:  https://doi.org/10.1111/hir.12526