bims-librar

Biomed News

on Biomedical librarianship

Issue of 2024–08–18
23 papers selected by
Thomas Krichel, Open Library Society

Searchsmart.org: Guiding researchers to the best databases and search systems for systematic reviews and beyond.
Supporting the working life exposome: Annotating occupational exposure for enhanced literature search.
Scientific paper recommender system using deep learning and link prediction in citation network.
Evaluation of Medical Subject Headings assignment in simulated patient articles.
Scoping review search practices in the social sciences: A scoping review.
Glycoscience data content in the NCBI Glycans and PubChem.
Human-Comparable Sensitivity of Large Language Models in Identifying Eligible Studies Through Title and Abstract Screening: 3-Layer Strategy Using GPT-3.5 and GPT-4 for Systematic Reviews.
The utility of artificial intelligence platforms for patient-generated questions in Mohs micrographic surgery: a multi-national, blinded expert panel evaluation.
Analyzing the performance of ChatGPT in answering inquiries about cervical cancer.
ChatGPT-4: Alcohol use disorder responses.
Evaluating the Efficacy of ChatGPT as a Patient Education Tool in Prostate Cancer: Multimetric Assessment.
Potential misinformation in websites on carpal tunnel syndrome.
Readability of Online Spanish Materials for Breast Reconstruction Using Deep Inferior Epigastric Perforator Flaps.
Google Trends-Assisted Analysis of the Readability, Accountability, and Accessibility of Online Patient Education Materials for the Treatment of AMD After US FDA Approval of Pegcetacoplan.
Evaluation of the Quality and Readability of Web-Based Information Regarding Foreign Bodies of the Ear, Nose, and Throat: Qualitative Content Analysis.
Quality of Video Content Related to Deep Inferior Epigastric Perforator Flap Breast Reconstruction: Social Media Platforms Versus Large Language Models.
Is it safe? Health promotion videos on YouTube and the safety of viewers - Views from Ghana.
YouTube as a source of recognizing acute stroke; progress in 2 years.
YouTube Videos Are a Moderately Comprehensive, Reliable, and Quality Option to Learn About "Multiple Sclerosis and Sexuality".
Diverticulosis and Diverticulitis on YouTube: Is Popular Information the Most Reliable?
Evaluation of treatment information quality on hypertension and diabetes on WeChat and TikTok: A cross-sectional content analysis.
Sociodemographic Factors Associated With Using eHealth for Information Seeking in the United States: Cross-Sectional Population-Based Study With 3 Time Points Using Health Information National Trends Survey Data.
Joint Extraction of Biomedical Events Based on Dynamic Path Planning Strategy and Hybrid Neural Network.

Res Synth Methods. 2024 Aug 11.

Searchsmart.org: Guiding researchers to the best databases and search systems for systematic reviews and beyond.

Michael Gusenbauer.

  When searching for scholarly documents, researchers often stick with the same familiar handful of databases. Yet, just beyond these limited horizons lie dozens of alternatives with which they could search more effectively, whether for quick lookups or thorough searches in systematic reviews or meta-analyses. Searchsmart.org is a free website that guides researchers to particularly suitable search options for their particular disciplines, offering a wide array of resources, including search engines, aggregators, journal platforms, repositories, clinical trials databases, bibliographic databases, and digital libraries. Search Smart currently evaluates the coverage and functionality of more than a hundred leading scholarly databases, including most major multidisciplinary databases and many that are discipline-specific. Search Smart's primary use cases involve database-selection decisions as part of systematic reviews, meta-analyses, or bibliometric analyses. Researchers can use up to 583 criteria to filter and sort recommendations of databases and the interfaces through which they can be accessed for user-friendliness, search rigor, or relevance. With specific pre-defined filter settings, researchers can quickly identify particularly suitable databases for Boolean keyword searching and forward or backward citation searching. Overall, Search Smart's recommendations help researchers to discover knowledge more effectively and efficiently by selecting the more suitable databases for their tasks.

Keywords:  bibliometric analyses; comparison of bibliographic databases; database selection; metamorphic testing; systematic reviews and meta‐analyses; systematic searching

DOI:  https://doi.org/10.1002/jrsm.1746
PLoS One. 2024;19(8): e0307844

Supporting the working life exposome: Annotating occupational exposure for enhanced literature search.

Paul Thompson, Sophia Ananiadou, Ioannis Basinas, Bendik C Brinchmann, Christine Cramer, Karen S Galea, Calvin Ge, Panagiotis Georgiadis, Jorunn Kirkeleit, Eelco Kuijpers, Nhung Nguyen, Roberto Nuñez, Vivi Schlünssen, Zara Ann Stokholm, Evana Amir Taher, Håkan Tinnerberg, Martie Van Tongeren, Qianqian Xie.

An individual's likelihood of developing non-communicable diseases is often influenced by the types, intensities and duration of exposures at work. Job exposure matrices provide exposure estimates associated with different occupations. However, due to their time-consuming expert curation process, job exposure matrices currently cover only a subset of possible workplace exposures and may not be regularly updated. Scientific literature articles describing exposure studies provide important supporting evidence for developing and updating job exposure matrices, since they report on exposures in a variety of occupational scenarios. However, the constant growth of scientific literature is increasing the challenges of efficiently identifying relevant articles and important content within them. Natural language processing methods emulate the human process of reading and understanding texts, but in a fraction of the time. Such methods can increase the efficiency of both finding relevant documents and pinpointing specific information within them, which could streamline the process of developing and updating job exposure matrices. Named entity recognition is a fundamental natural language processing method for language understanding, which automatically identifies mentions of domain-specific concepts (named entities) in documents, e.g., exposures, occupations and job tasks. State-of-the-art machine learning models typically use evidence from an annotated corpus, i.e., a set of documents in which named entities are manually marked up (annotated) by experts, to learn how to detect named entities automatically in new documents. We have developed a novel annotated corpus of scientific articles to support machine learning based named entity recognition relevant to occupational substance exposures. 
Through incremental refinements to the annotation process, we demonstrate that expert annotators can attain high levels of agreement, and that the corpus can be used to train high-performance named entity recognition models. The corpus thus constitutes an important foundation for the wider development of natural language processing tools to support the study of occupational exposures.

DOI: https://doi.org/10.1371/journal.pone.0307844
Heliyon. 2024 Jul 30. 10(14): e34685

Scientific paper recommender system using deep learning and link prediction in citation network.

Weijuan Li.

  The number of published scientific articles grows daily, which makes searching for articles more difficult. This creates a strong need for recommender systems (RSs) dedicated to suggesting scientific articles, because searching based only on matching the titles or content of other articles is not efficient. In this research, content analysis and citation network analysis are combined to design an RS for scientific articles (RECSA). In RECSA, natural language processing and deep learning techniques are used to process the titles and extract the content attributes of the articles. First, the titles of the articles are pre-processed, and the importance of each word in the title is estimated using the Term Frequency-Inverse Document Frequency (TF-IDF) criterion. The dimensionality of the resulting attributes is then reduced using a convolutional neural network (CNN), and the content similarity matrix of the articles is calculated from the attribute vectors using the cosine similarity criterion. In addition, a link prediction approach is used to analyze the connections in the citation network of the articles. Finally, in the third step of RECSA, the two similarity matrices calculated in the previous steps are combined using an influence coefficient parameter to obtain the final similarity matrix, and the recommendation is based on the highest similarity value. The efficiency of RECSA has been evaluated from different aspects and the results have been compared with previous works. According to the results, combining TF-IDF and CNN to analyze content-based features leads to an improvement of at least 0.32% in precision compared to previous works. Also, by integrating citation and content-based data, the precision of the first suggestion in RECSA is 99.01%, a minimum improvement of 0.9% over the compared methods. The results show that RECSA can make recommendations with higher accuracy and efficiency.
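
The combination step described in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the CNN dimensionality reduction is omitted, the link-prediction score is replaced by a simple Jaccard overlap of citation neighbours, and the names `combined_similarity`, `recommend`, and `alpha` (standing in for the influence coefficient) are assumptions made for illustration only.

```python
import math
from collections import Counter


def tfidf_vectors(titles):
    """Return one sparse TF-IDF dict per title (lowercased whitespace tokens).

    Uses a smoothed IDF, log((1 + n) / (1 + df)) + 1, one common variant.
    """
    docs = [t.lower().split() for t in titles]
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append({
            w: (c / len(d)) * (math.log((1 + n) / (1 + df[w])) + 1.0)
            for w, c in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def jaccard(a, b):
    """Jaccard overlap of two neighbour sets (assumed stand-in for link prediction)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(titles, neighbours, alpha=0.5):
    """Blend content and citation similarity: alpha * content + (1 - alpha) * citation."""
    vecs = tfidf_vectors(titles)
    n = len(titles)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                content = cosine(vecs[i], vecs[j])
                citation = jaccard(neighbours[i], neighbours[j])
                sim[i][j] = alpha * content + (1 - alpha) * citation
    return sim


def recommend(sim, i):
    """Index of the article most similar to article i (excluding i itself)."""
    return max((j for j in range(len(sim)) if j != i), key=lambda j: sim[i][j])


# Hypothetical toy data: three titles and the sets of papers each one cites.
titles = [
    "deep learning for citation recommendation",
    "citation recommendation with deep networks",
    "glycan structure databases",
]
neighbours = [{10, 11}, {10, 11, 12}, {20}]
sim = combined_similarity(titles, neighbours, alpha=0.5)
print(recommend(sim, 0))
```

Setting `alpha` closer to 1 weights title content more heavily, while values closer to 0 favour the citation structure; the value used in the paper is not stated in the abstract.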

Keywords:  Citation network of scientific papers; Content-based paper recommendation; Recommender system (RS); Text processing

DOI:  https://doi.org/10.1016/j.heliyon.2024.e34685
Int J Pharm Pract. 2024 Aug 14. pii: riae042. [Epub ahead of print]

Evaluation of Medical Subject Headings assignment in simulated patient articles.

Fernanda S Tonin, Luciana G Negrão, Isabela P Meza, Fernando Fernandez-Llimos.

   OBJECTIVES: To evaluate human-based Medical Subject Headings (MeSH) allocation in articles about 'patient simulation'-a technique that mimics real-life patient scenarios with controlled patient responses.
METHODS: A validation set of articles indexed before the Medical Text Indexer-Auto implementation (in 2019) was created with 150 combinations potentially referring to 'patient simulation'. Articles were classified into four categories of simulation studies. Allocation of seven MeSH terms (Simulation Training, Patient Simulation, High Fidelity Simulation Training, Computer Simulation, Patient-Specific Modelling, Virtual Reality, and Virtual Reality Exposure Therapy) was investigated. Accuracy metrics (sensitivity, precision, or positive predictive value) were calculated for each category of studies.
KEY FINDINGS: A set of 7213 articles was obtained from 53 different word combinations, with 2634 excluded as irrelevant. 'Simulated patient' and 'standardized/standardised patient' were the most used terms. The 4579 included articles, published in 1044 different journals, were classified into: 'Machine/Automation' (8.6%), 'Education' (75.9%) and 'Practice audit' (11.4%); 4.1% were 'Unclear'. Articles were indexed with a median of 10 MeSH (IQR 8-13); however, 45.5% were not indexed with any of the seven MeSH terms. Patient Simulation was the most prevalent MeSH (24.0%). Automation articles were more associated with Computer Simulation MeSH (sensitivity = 54.5%; precision = 25.1%), while Education articles were associated with Patient Simulation MeSH (sensitivity = 40.2%; precision = 80.9%). Practice audit articles were also polarized to Patient Simulation MeSH (sensitivity = 34.6%; precision = 10.5%).
CONCLUSIONS: Inconsistent use of free-text words related to patient simulation was observed, as well as inaccuracies in human-based MeSH assignments. These limitations can compromise relevant literature retrieval to support evidence synthesis exercises.
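
The sensitivity and precision figures reported above can be read as follows: for a given study category and MeSH term, sensitivity is the share of the category's articles that carry the term, and precision is the share of articles carrying the term that belong to the category. A minimal sketch, assuming articles are identified by hypothetical integer IDs (the function name `heading_metrics` and the toy data are illustrative and not taken from the study):

```python
def heading_metrics(category_ids, indexed_ids):
    """Return (sensitivity, precision) of a MeSH term for one study category.

    category_ids: IDs of articles classified into the category.
    indexed_ids:  IDs of articles indexed with the MeSH term.
    """
    tp = len(category_ids & indexed_ids)   # in category and carrying the term
    fn = len(category_ids - indexed_ids)   # in category but missing the term
    fp = len(indexed_ids - category_ids)   # carrying the term, other category
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision


# Hypothetical example: 10 articles in a category, 5 articles carry the term.
category = set(range(10))
indexed = {0, 1, 2, 3, 10}
print(heading_metrics(category, indexed))
```

A low sensitivity with a high precision, as reported for Education articles and Patient Simulation, means the term is rarely wrong when assigned but is missing from many articles that would warrant it.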

Keywords:  MEDLINE; Medical Subject Headings; National Library of Medicine; bibliometrics; patient simulation

DOI:  https://doi.org/10.1093/ijpp/riae042
Res Synth Methods. 2024 Aug 12.

Scoping review search practices in the social sciences: A scoping review.

Judith Logan, Jenaya Webb, Nalini K Singh, Nailisa Tanner, Kathryn Barrett, Margaret Wall, Benjamin Walsh, Ana Patricia Ayala.

  A thorough literature search is a key feature of scoping reviews. We investigated the search practices used by social science researchers as reported in their scoping reviews. We collected scoping reviews published between 2015 and 2021 from Social Science Citation Index. In the 2484 included studies, we observed a 58% average annual increase in published reviews, primarily from clinical and applied social science disciplines. Bibliographic databases comprised most of the information sources in the primary search strategy (n = 9565, 75%), although reporting practices varied. Most scoping reviews (n = 1805, 73%) included at least one supplementary search strategy. A minority of studies (n = 713, 29%) acknowledged an LIS professional and few listed one as a co-author (n = 194, 8%). We conclude that to improve reporting and strengthen the impact of the scoping review method in the social sciences, researchers should consider (1) adhering to PRISMA-S reporting guidelines, (2) employing more supplementary search strategies, and (3) collaborating with LIS professionals.

Keywords:  information retrieval; knowledge synthesis; librarians and information professionals; scoping reviews; searching; social sciences

DOI:  https://doi.org/10.1002/jrsm.1742
Anal Bioanal Chem. 2024 Aug 12.

Glycoscience data content in the NCBI Glycans and PubChem.

Sunghwan Kim, Jian Zhang, Tiejun Cheng, Qingliang Li, Evan E Bolton.

  Studying glycans and their functions in the body aids in the understanding of disease mechanisms and developing new treatments. This necessitates resources that provide comprehensive glycan data integrated with relevant information from other scientific fields such as genomics, genetics, proteomics, metabolomics, and chemistry. The present paper describes two resources at the U.S. National Center for Biotechnology Information (NCBI), the NCBI Glycans and PubChem, which provide glycan-related information useful for the glycoscience research community. The NCBI Glycans ( https://www.ncbi.nlm.nih.gov/glycans/ ) is a dedicated website for glycobiology data content at NCBI and provides quick access to glycan-related information scattered across multiple NCBI databases as well as other information resources external to NCBI. Importantly, the NCBI Glycans hosts the official web page for the symbol nomenclature for glycans (SNFG), which is the standard graphical representation of glycan structures recommended for scientific publication. On the other hand, PubChem ( https://pubchem.ncbi.nlm.nih.gov ) is a research-focused, large-scale public chemical database, containing a substantial number of glycan-containing records and is integrated with important glycoscience resources like GlyTouCan, GlyCosmos, and GlyGen. PubChem organizes glycan-related information within multiple data collections (i.e., Substance, Compound, Protein, Gene, Pathway, and Taxonomy) and provides various tools and services that allow users to access them both interactively through a web browser and programmatically through a REST-ful interface, including PUG-View. The NCBI Glycans and PubChem highlight glycan-related data and improve their accessibility, helping scientists exploit these data in their research.
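
As an illustration of the programmatic access mentioned above, the following sketch only builds a PUG-View request URL and performs no network call. The URL layout reflects the commonly documented PUG-View pattern and should be checked against the current PubChem documentation; the helper name `pug_view_url`, the example compound identifier, and the example heading are assumptions for illustration.

```python
from urllib.parse import quote, urlencode

# Base path of the PUG-View service (assumed from PubChem's public documentation).
PUG_VIEW_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data"


def pug_view_url(record_type, record_id, heading=None, output="JSON"):
    """Build a PUG-View URL for one record, optionally limited to one heading."""
    url = f"{PUG_VIEW_BASE}/{quote(record_type)}/{record_id}/{output}"
    if heading:
        # Percent-encode spaces so the heading survives as a query parameter.
        url += "?" + urlencode({"heading": heading}, quote_via=quote)
    return url


# Arbitrary example compound identifier, used only to show the URL shape.
print(pug_view_url("compound", 5793))
print(pug_view_url("compound", 5793, heading="Names and Identifiers"))
```

The resulting URL can be passed to any HTTP client; request rate limits and response structure are described in the PubChem documentation rather than in this abstract.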

Keywords:  GlyCosmos; GlyGen; GlyTouCan; Glycan; PUG-View; PubChem

DOI:  https://doi.org/10.1007/s00216-024-05459-7
J Med Internet Res. 2024 Aug 16. 26: e52758

Human-Comparable Sensitivity of Large Language Models in Identifying Eligible Studies Through Title and Abstract Screening: 3-Layer Strategy Using GPT-3.5 and GPT-4 for Systematic Reviews.

Kentaro Matsui, Tomohiro Utsumi, Yumi Aoki, Taku Maruki, Masahiro Takeshima, Yoshikazu Takaesu.

   BACKGROUND: The screening process for systematic reviews is resource-intensive. Although previous machine learning solutions have reported reductions in workload, they risked excluding relevant papers.
OBJECTIVE: We evaluated the performance of a 3-layer screening method using GPT-3.5 and GPT-4 to streamline the title and abstract-screening process for systematic reviews. Our goal is to develop a screening method that maximizes sensitivity for identifying relevant records.
METHODS: We conducted screenings on 2 of our previous systematic reviews related to the treatment of bipolar disorder, with 1381 records from the first review and 3146 from the second. Screenings were conducted using GPT-3.5 (gpt-3.5-turbo-0125) and GPT-4 (gpt-4-0125-preview) across three layers: (1) research design, (2) target patients, and (3) interventions and controls. The 3-layer screening was conducted using prompts tailored to each study. During this process, information extraction according to each study's inclusion criteria and optimization for screening were carried out using a GPT-4-based flow without manual adjustments. Records were evaluated at each layer, and those meeting the inclusion criteria at all layers were subsequently judged as included.
RESULTS: On each layer, both GPT-3.5 and GPT-4 were able to process about 110 records per minute, and the total time required for screening the first and second studies was approximately 1 hour and 2 hours, respectively. In the first study, the sensitivities/specificities of the GPT-3.5 and GPT-4 were 0.900/0.709 and 0.806/0.996, respectively. Both screenings by GPT-3.5 and GPT-4 judged all 6 records used for the meta-analysis as included. In the second study, the sensitivities/specificities of the GPT-3.5 and GPT-4 were 0.958/0.116 and 0.875/0.855, respectively. The sensitivities for the relevant records align with those of human evaluators: 0.867-1.000 for the first study and 0.776-0.979 for the second study. Both screenings by GPT-3.5 and GPT-4 judged all 9 records used for the meta-analysis as included. After accounting for justifiably excluded records by GPT-4, the sensitivities/specificities of the GPT-4 screening were 0.962/0.996 in the first study and 0.943/0.855 in the second study. Further investigation indicated that the cases incorrectly excluded by GPT-3.5 were due to a lack of domain knowledge, while the cases incorrectly excluded by GPT-4 were due to misinterpretations of the inclusion criteria.
CONCLUSIONS: Our 3-layer screening method with GPT-4 demonstrated an acceptable level of sensitivity and specificity that supports its practical application in systematic review screenings. Future research should aim to generalize this approach and explore its effectiveness in diverse settings, both medical and nonmedical, to fully establish its use and operational feasibility.
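
The decision rule and the reported metrics can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the per-layer judgment is passed in as a hypothetical `judge` function (in the study it would be a prompted GPT model), and the toy records and labels are invented.

```python
# The three screening layers named in the abstract.
LAYERS = ("research design", "target patients", "interventions and controls")


def screen(record, judge):
    """Include a record only if the judge accepts it at every layer.

    all() stops at the first rejecting layer, so later layers are skipped.
    """
    return all(judge(record, layer) for layer in LAYERS)


def sensitivity_specificity(predicted, actual):
    """Compute (sensitivity, specificity) from parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical records with pre-computed layer judgments (no model is called).
records = [
    {"research design": True, "target patients": True, "interventions and controls": True},
    {"research design": True, "target patients": False, "interventions and controls": True},
    {"research design": True, "target patients": True, "interventions and controls": True},
    {"research design": False, "target patients": True, "interventions and controls": True},
    {"research design": False, "target patients": False, "interventions and controls": False},
]
human_labels = [True, False, False, True, False]


def lookup_judge(record, layer):
    """Stand-in judge that reads the stored judgment for a layer."""
    return record[layer]


predicted = [screen(r, lookup_judge) for r in records]
print(sensitivity_specificity(predicted, human_labels))
```

Because a record must pass every layer, a single false rejection at any layer excludes a relevant record, which is why the abstract emphasizes sensitivity.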

Keywords:  GPT-3.5; GPT-4; artificial intelligence; information science; language model; library science; meta-analysis; prompt engineering; screening; systematic review

DOI:  https://doi.org/10.2196/52758
Int J Dermatol. 2024 Aug 09.

The utility of artificial intelligence platforms for patient-generated questions in Mohs micrographic surgery: a multi-national, blinded expert panel evaluation.

Kyle C Lauck, Seo Won Cho, Matthew DaCunha, John Wuennenberg, Sumaira Asai, Murad Alam, Sarah T Arron, Anna Bar, David G Brodland, Felipe B Cerci, Joel L Cohen, Brett Coldiron, M Laurin Council, Christopher B Harmon, George Hruza, Severin Läuchli, Brent R Moody, Ashley S Wysong, John A Zitelli, Stanislav N Tolkachjov.

   BACKGROUND: Artificial intelligence (AI) and large language models (LLMs) transform how patients inform themselves. LLMs offer potential as educational tools, but their quality depends upon the information generated. Current literature examining AI as an informational tool in dermatology has been limited in evaluating AI's multifaceted roles and diversity of opinions. Here, we evaluate LLMs as a patient-educational tool for Mohs micrographic surgery (MMS) in and out of the clinic utilizing an international expert panel.
METHODS: The most common patient MMS questions were extracted from Google and transposed into two LLMs and Google's search engine. 15 MMS surgeons evaluated the generated responses, examining their appropriateness as a patient-facing informational platform, sufficiency of response in a clinical environment, and accuracy of content generated. Validated scales were employed to assess the comprehensibility of each response.
RESULTS: The majority of reviewers deemed all LLM responses appropriate. 75% of responses were rated as mostly accurate or higher. ChatGPT had the highest mean accuracy. The majority of the panel deemed 33% of responses sufficient for clinical practice. The mean comprehensibility scores for all platforms indicated a required 10th-grade reading level.
CONCLUSIONS: LLM-generated responses were rated as appropriate patient informational sources and mostly accurate in their content. However, these platforms may not provide sufficient information to function in a clinical environment, and complex comprehensibility may represent a barrier to utilization. As the popularity of these platforms increases, it is important for dermatologists to be aware of these limitations.

Keywords:  AI; Mohs micrographic surgery; artificial intelligence; dermatologic surgery; language learning model; patient education; skin cancer education

DOI:  https://doi.org/10.1111/ijd.17382
Int J Gynaecol Obstet. 2024 Aug 16.

Analyzing the performance of ChatGPT in answering inquiries about cervical cancer.

Engin Yurtcu, Seyfettin Ozvural, Betul Keyif.

   OBJECTIVE: To analyze the knowledge of ChatGPT about cervical cancer (CC).
METHODS: Official websites of professional health institutes, and websites created by patients and charities, underwent strict screening. Using CC-related keywords, common inquiries by the public and comments about CC were searched in social media applications; with these data, a list of frequently asked questions (FAQs) was prepared. When preparing questions about CC, the European Society of Gynecological Oncology (ESGO), European Society for Radiotherapy and Oncology (ESTRO), and European Society of Pathology (ESP) guidelines were used. The answers given by ChatGPT were scored according to the Global Quality Score (GQS).
RESULTS: When all ChatGPT answers to FAQs about CC were evaluated with regard to GQS, 68 ChatGPT answers were classified as score 5, and none of the ChatGPT answers to FAQs were scored as 2 or 1. Moreover, ChatGPT answered 33 of 53 (62.3%) CC-related questions based on ESGO, ESTRO, and ESP guidelines with completely accurate and satisfactory responses (GQS 5). In addition, eight answers (15.1%), seven answers (13.2%), four answers (7.5%), and one answer (1.9%) were categorized as GQS 4, GQS 3, GQS 2, and GQS 1, respectively. The reproducibility rate of ChatGPT answers about CC-related FAQs and responses about those guideline-based questions was 93.2% and 88.7%, respectively.
CONCLUSION: ChatGPT had an accurate and satisfactory response rate for FAQs about CC with regard to GQS. However, the accuracy and quality of ChatGPT answers significantly decreased for questions based on guidelines.

Keywords:  ChatGPT; artificial intelligence; cervical cancer; guideline

DOI:  https://doi.org/10.1002/ijgo.15861
Addiction. 2024 Aug 14.

ChatGPT-4: Alcohol use disorder responses.

Alex M Russell, Samuel F Acuff, John F Kelly, Jon-Patrick Allem, Brandon G Bergman.

   BACKGROUND AND AIMS: Alcohol use disorder (AUD) is characterized by low levels of engagement with effective treatments. Enhancing awareness of AUD treatments and how to navigate the treatment system is crucial. Many individuals use online sources (e.g. search engines) for answers to health-related questions; web-based results include a mix of high- and low-quality information. Artificial intelligence may improve access to quality health information by providing concise, high-quality responses to complex health-related questions. This study evaluated the quality of ChatGPT-4 responses to AUD-related queries.
METHOD: A comprehensive list of 64 AUD-related questions was developed through a combination of Google Trends analysis and expert consultation. ChatGPT-4 was prompted with each question, followed by a request to provide 3-5 peer-reviewed scientific citations supporting each response. Responses were evaluated for whether they were evidence-based, provided a referral and provided supporting documentation.
RESULTS: ChatGPT-4 responded to all AUD-related queries, with 92.2% (59/64) of responses being fully evidence-based. Although only 12.5% (8/64) of responses included referrals to external resources, all responses (100%; 5/5) to location-specific ('near me') queries directed individuals to appropriate resources like the NIAAA Treatment Navigator. Most (85.9%; 55/64) responses to the follow-up question provided supporting documentation.
CONCLUSIONS: ChatGPT-4 responds to alcohol use disorder-related questions with evidence-based information and supporting documentation. ChatGPT-4 could be promoted as a reasonable resource for those looking online for alcohol use disorder-related information.

Keywords:  ChatGPT; alcohol; alcohol use disorder; artificial intelligence; drinking; health communication; health information; large language models

DOI:  https://doi.org/10.1111/add.16650
J Med Internet Res. 2024 Aug 14. 26: e55939

Evaluating the Efficacy of ChatGPT as a Patient Education Tool in Prostate Cancer: Multimetric Assessment.

Damien Gibson, Stuart Jackson, Ramesh Shanmugasundaram, Ishith Seth, Adrian Siu, Nariman Ahmadi, Jonathan Kam, Nicholas Mehan, Ruban Thanigasalam, Nicola Jeffery, Manish I Patel, Scott Leslie.

   BACKGROUND: Artificial intelligence (AI) chatbots, such as ChatGPT, have made significant progress. These chatbots, particularly popular among health care professionals and patients, are transforming patient education and disease experience with personalized information. Accurate, timely patient education is crucial for informed decision-making, especially regarding prostate-specific antigen screening and treatment options. However, the accuracy and reliability of AI chatbots' medical information must be rigorously evaluated. Studies testing ChatGPT's knowledge of prostate cancer are emerging, but there is a need for ongoing evaluation to ensure the quality and safety of information provided to patients.
OBJECTIVE: This study aims to evaluate the quality, accuracy, and readability of ChatGPT-4's responses to common prostate cancer questions posed by patients.
METHODS: Overall, 8 questions were formulated with an inductive approach based on information topics in peer-reviewed literature and Google Trends data. Adapted versions of the Patient Education Materials Assessment Tool for AI (PEMAT-AI), Global Quality Score, and DISCERN-AI tools were used by 4 independent reviewers to assess the quality of the AI responses. The 8 AI outputs were judged by 7 expert urologists, using an assessment framework developed to assess accuracy, safety, appropriateness, actionability, and effectiveness. The AI responses' readability was assessed using established algorithms (Flesch Reading Ease score, Gunning Fog Index, Flesch-Kincaid Grade Level, The Coleman-Liau Index, and Simple Measure of Gobbledygook [SMOG] Index). A brief tool (Reference Assessment AI [REF-AI]) was developed to analyze the references provided by AI outputs, assessing for reference hallucination, relevance, and quality of references.
RESULTS: The PEMAT-AI understandability score was very good (mean 79.44%, SD 10.44%), the DISCERN-AI rating was scored as "good" quality (mean 13.88, SD 0.93), and the Global Quality Score was high (mean 4.46/5, SD 0.50). Natural Language Assessment Tool for AI had pooled mean accuracy of 3.96 (SD 0.91), safety of 4.32 (SD 0.86), appropriateness of 4.45 (SD 0.81), actionability of 4.05 (SD 1.15), and effectiveness of 4.09 (SD 0.98). The readability algorithm consensus was "difficult to read" (Flesch Reading Ease score mean 45.97, SD 8.69; Gunning Fog Index mean 14.55, SD 4.79), averaging an 11th-grade reading level, equivalent to 15- to 17-year-olds (Flesch-Kincaid Grade Level mean 12.12, SD 4.34; The Coleman-Liau Index mean 12.75, SD 1.98; SMOG Index mean 11.06, SD 3.20). REF-AI identified 2 reference hallucinations, while the majority (28/30, 93%) of references appropriately supplemented the text. Most references (26/30, 86%) were from reputable government organizations, while a handful were direct citations from scientific literature.
CONCLUSIONS: Our analysis found that ChatGPT-4 provides generally good responses to common prostate cancer queries, making it a potentially valuable tool for patient education in prostate cancer care. Objective quality assessment tools indicated that the natural language processing outputs were generally reliable and appropriate, but there is room for improvement.
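
Three of the readability indices named above follow simple published formulas, sketched here for orientation. The functions take counts supplied by the caller; counting sentences, syllables, and polysyllabic words in real text requires additional heuristics that are not shown, and the function names and the example counts are illustrative assumptions.

```python
import math


def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def smog_index(polysyllables, sentences):
    """SMOG Index from the count of words with three or more syllables."""
    return 1.0430 * math.sqrt(polysyllables * 30 / sentences) + 3.1291


# Hypothetical counts for a short passage, chosen only to exercise the formulas.
words, sentences, syllables, polysyllables = 100, 5, 150, 12
print(round(flesch_reading_ease(words, sentences, syllables), 2))
print(round(flesch_kincaid_grade(words, sentences, syllables), 2))
print(round(smog_index(polysyllables, sentences), 2))
```

Longer sentences and more syllables per word lower the Flesch Reading Ease score and raise the grade-level indices, which is consistent with the "difficult to read" consensus reported in the results.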

Keywords:  AI; AI chatbots; AI language model; ChatGPT; NLP; antigen screening; artificial intelligence; cancer; decision-making; health care professional; health care professionals; large language model; man; medical information; men; multimetric assessment; natural language processing; patient education; prostate; prostate cancer; prostate specific

DOI:  https://doi.org/10.2196/55939
PEC Innov. 2024 Dec 15. 5: 100323

Potential misinformation in websites on carpal tunnel syndrome.

Ria Goyal, Grace Corrier, David Ring, Amirreza Fatehi, Sina Ramtin.

   Objective: We sought to evaluate the potential reinforcement of misconceptions in websites discussing carpal tunnel syndrome (CTS).
Methods: After removing all cookies to limit personalization, we entered "carpal tunnel syndrome" into five search engines and collected the first 50 results displayed for each search. For each of the 105 unique websites, we recorded publication date, author background, and number of views. The prevalence of potential reinforcement and/or reorientation of misconceptions for each website was then scored using a rubric based on our interpretation of the best current evidence regarding CTS. The informational quality of websites was graded with the DISCERN instrument, a validated tool for assessing online health information.
Results: Every website contained at least one potentially misleading statement in our opinion. The most common misconceptions reference "excessive motion" and "inflammation." Greater potential reinforcement of misinformation about CTS was associated with fewer page views and lower informational quality scores.
Conclusions: Keeping in mind that this analysis is based on our interpretation of current best evidence, potential misinformation on websites addressing CTS is common and has the potential to increase symptom intensity and magnitude of incapability via reinforcement of unhelpful thoughts regarding symptoms.
Innovation: The prevalence of patient-directed health information that can increase discomfort and incapability by reinforcing common unhelpful thoughts supports the need for innovations in how we develop, oversee, and evolve healthy online material.

Keywords:  Carpal tunnel syndrome; Median nerve; Misinformation; Website

DOI:  https://doi.org/10.1016/j.pecinn.2024.100323
Cureus. 2024 Jul;16(7): e64616

Readability of Online Spanish Materials for Breast Reconstruction Using Deep Inferior Epigastric Perforator Flaps.

Cameron Gerhold, Timothy E Nehila, Virginia Bailey, Bilal Koussayer, Mohammad Tahseen Alkaelani, Nicole K Le, Mahmood Al Bayati, Kristen Whalen, D'Arcy Wainwright, Deniz Dayicioglu.

  Background: The internet has become an increasingly popular tool for patients to find information pertaining to medical procedures. Although the information is easily accessible, data shows that many online educational materials pertaining to surgical subspecialties are far above the average reading level in the United States. The aim of this study was to evaluate the English and Spanish online materials for the deep inferior epigastric perforator (DIEP) flap reconstruction procedure. Methods: The first eight institutional or organizational websites that provided information on the DIEP procedure in English and Spanish were included. Each website was evaluated using the Patient Education and Materials Assessment Tool (PEMAT), Cultural Sensitivity Assessment Tool (CSAT), and either Simplified Measure of Gobbledygook (SMOG) for English websites or Spanish Orthographic Length (SOL) for Spanish websites. Results: The English websites had a statistically lower CSAT score compared to the Spanish websites (p=0.006). However, Spanish websites had a statistically higher percentage of complex words compared to English sources (p<0.001). An analysis of reading grade levels through SMOG and SOL scores revealed that Spanish websites had statistically lower scores (p<0.001). There were no statistically significant differences in the understandability or actionability scores between the English and Spanish websites. Conclusions: Online educational materials on the DIEP flap reconstruction procedure should be readable, understandable, actionable, and culturally sensitive. Our analysis revealed that improvements can be made in understandability and actionability on these websites. Plastic surgeons should be aware of what constitutes a great online educational resource and what online educational materials their patients will have access to.

Keywords:  autologous breast reconstruction; deep inferior epigastric artery perforator flap; plastic and reconstructive surgery; procedural education; readability; spanish health education

DOI:  https://doi.org/10.7759/cureus.64616
J Vitreoretin Dis. 2024 Jul-Aug;8(4): 421-427

Google Trends-Assisted Analysis of the Readability, Accountability, and Accessibility of Online Patient Education Materials for the Treatment of AMD After US FDA Approval of Pegcetacoplan.

Samuel A Cohen, Arthur Brant, Nadim Rayess, Ehsan Rahimy, Carolyn Pan, Ann Caroline Fisher, Suzann Pershing, Diana Do.

  Purpose: To evaluate the readability, accountability, accessibility, and source of online patient education materials for treatment of age-related macular degeneration (AMD) and to quantify public interest in Syfovre and geographic atrophy after US Food and Drug Administration (FDA) approval. Methods: Websites were classified into 4 categories by information source. Readability was assessed using 5 validated readability indices. Accountability was assessed using 4 benchmarks of the Journal of the American Medical Association (JAMA). Accessibility was evaluated using 3 established criteria. The Google Trends tool was used to evaluate temporal trends in public interest in "Syfovre" and "geographic atrophy" in the months after FDA approval. Results: Of 100 websites analyzed, 22% were written below the recommended sixth-grade reading level. The mean (±SD) grade level of analyzed articles was 9.76 ± 3.35. Websites averaged 1.40 ± 1.39 (of 4) JAMA accountability metrics. The majority of articles (67%) were from private practice/independent organizations. A significant increase in the public interest in the terms "Syfovre" and "geographic atrophy" after FDA approval was found with the Google Trends tool (P < .001). Conclusions: Patient education materials related to AMD treatment are often written at inappropriate reading levels and lack established accountability and accessibility metrics. Articles from national organizations ranked highest on accessibility metrics but were less visible on a Google search, suggesting the need for visibility-enhancing measures. Patient education materials related to the term "Syfovre" had the highest average reading level and low accountability, suggesting the need to modify resources to best address the needs of an increasingly curious public.

Keywords:  Syfovre; accessibility; accountability; macular degeneration; online; patient education; readability

DOI:  https://doi.org/10.1177/24741264241250156
JMIR Form Res. 2024 Aug 15. 8 e55535

Evaluation of the Quality and Readability of Web-Based Information Regarding Foreign Bodies of the Ear, Nose, and Throat: Qualitative Content Analysis.

Tsz Ki Ko, Denise Jia Yun Tan, Ka Siu Fan.

   BACKGROUND: Foreign body (FB) inhalation, ingestion, and insertion account for 11% of emergency admissions for ear, nose, and throat conditions. Children are disproportionately affected, and urgent intervention may be needed to maintain airway patency and prevent blood vessel occlusion. High-quality, readable online information could help reduce poor outcomes from FBs.
OBJECTIVE: We aim to evaluate the quality and readability of available online health information relating to FBs.
METHODS: In total, 6 search phrases were queried using the Google search engine. For each search term, the first 30 results were captured. Websites in the English language and displaying health information were included. The provider and country of origin were recorded. The modified 36-item Ensuring Quality Information for Patients tool was used to assess information quality. Readability was assessed using a combination of tools: Flesch Reading Ease score, Flesch-Kincaid Grade Level, Gunning-Fog Index, and Simple Measure of Gobbledygook.
RESULTS: After the removal of duplicates, 73 websites were assessed, with the majority originating from the United States (n=46, 63%). Overall, the content was of moderate quality, with a median Ensuring Quality Information for Patients score of 21 (IQR 18-25, maximum 29) out of a maximum possible score of 36. Precautionary measures were not mentioned on 41% (n=30) of websites and 30% (n=22) did not identify disk batteries as a risky FB. Red flags necessitating urgent care were identified on 95% (n=69) of websites, with 89% (n=65) advising patients to seek medical attention and 38% (n=28) advising on safe FB removal. Readability scores (Flesch Reading Ease score=12.4, Flesch-Kincaid Grade Level=6.2, Gunning-Fog Index=6.5, and Simple Measure of Gobbledygook=5.9 years) showed most websites (56%) were below the recommended sixth-grade level.
CONCLUSIONS: The current quality and readability of information regarding FBs is inadequate. More than half of the websites were above the recommended sixth-grade reading level, and important information regarding high-risk FBs such as disk batteries and magnets was frequently excluded. Strategies should be developed to improve access to high-quality information that informs patients and parents about risks and when to seek medical help. Strategies to promote high-quality websites in search results also have the potential to improve outcomes.
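
For readers unfamiliar with the four readability indices named in this abstract, the following is a minimal sketch of their standard published formulas. It is not the authors' tooling, which the abstract does not specify. The function names and the example counts are hypothetical, and counting words, sentences, and syllables from raw text is assumed to happen elsewhere.

```python
import math

# Standard readability formulas computed from pre-tallied text counts.
# All counts passed in below are hypothetical illustration values.

def flesch_reading_ease(words, sentences, syllables):
    # Higher scores indicate easier text.
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    # Approximate US school grade level.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def gunning_fog(words, sentences, complex_words):
    # complex_words: words with three or more syllables.
    return 0.4 * ((words / sentences) + 100.0 * (complex_words / words))

def smog_grade(polysyllables, sentences):
    # polysyllables: count of words with three or more syllables.
    return 1.0430 * math.sqrt(polysyllables * (30.0 / sentences)) + 3.1291

# Hypothetical passage: 100 words, 5 sentences, 150 syllables,
# 10 words of three or more syllables.
fre = flesch_reading_ease(100, 5, 150)
fkgl = flesch_kincaid_grade(100, 5, 150)
fog = gunning_fog(100, 5, 10)
smog = smog_grade(10, 5)
print(round(fre, 2), round(fkgl, 2), round(fog, 2), round(smog, 2))
```

The three grade-level indices are read against the sixth-grade recommendation cited in the abstract, while the Flesch Reading Ease score runs in the opposite direction, with lower values indicating harder text.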

Keywords:  EQIP; Ensuring Quality Information for Patients; evaluation; evaluations; foreign body; grade level; health information; information resource; information resources; medical informatics; online information; quality; quality of internet information; readability; readability of internet information; readable; reading level; website; websites

DOI:  https://doi.org/10.2196/55535
Ann Plast Surg. 2024 Aug 05.

Quality of Video Content Related to Deep Inferior Epigastric Perforator Flap Breast Reconstruction: Social Media Platforms Versus Large Language Models.

Manuel Viñuela Florido, Javier Suárez Aguilar, Andrés A Maldonado, Lara Cristóbal Velasco.

INTRODUCTION: The deep inferior epigastric perforator (DIEP) flap is currently one of the main options in breast reconstruction. The information about this surgery is critical for the patient, in order to choose the breast reconstruction method. Our study aims to analyze and compare the quality and accuracy of the information related to the DIEP flap reconstruction method provided by social media platforms (SMPs) and the new large language models (LLMs).
MATERIALS AND METHODS: A total of 50 videos in English and Spanish were selected from the main SMPs (YouTube, Instagram, and Facebook) using the keywords "DIEP flap" and "colgajo DIEP." The duration, number of likes, dislikes, number of visits, upload date, author, and the video category (institutional video, media, patient experience, academic, and surgery) were analyzed. Three specific questions were posed to 2 new LLMs (ChatGPT and Google Bard). The quality of information in SMPs and LLMs was analyzed and compared by 2 independent board-certified plastic surgeons using the Journal of the American Medical Association and DISCERN scales.
RESULTS: LLMs showed a statistically significantly higher quality of information when compared with SMPs based on the DISCERN scores. The average DISCERN scores for answers given by ChatGPT and Google Bard were 54 ± 6.841 and 61.17 ± 6.306, respectively (good quality). In SMPs, the average scores were 2.31 ± 0.67 (insufficient quality) and 32.87 ± 9.62 (low quality) for the Journal of the American Medical Association and DISCERN scales, respectively. Thirty-eight percent of the videos in SMPs were produced by nonmedical authors.
CONCLUSIONS: The quality of information for breast reconstruction using DIEP flaps from LLMs was considered good and significantly better than in SMPs. The information found in SMPs was insufficient and of low quality. Academic plastic surgeons have an opportunity to provide quality content on this type of reconstruction in LLMs and SMPs.

DOI: https://doi.org/10.1097/SAP.0000000000004045
Prev Med Rep. 2024 Sep;45 102836

Is it safe? Health promotion videos on YouTube and the safety of viewers - Views from Ghana.

Martin Gameli Akakpo, Evelyn Owusu Roberts.

   Objectives: Health promotion videos are trending and abundant. Information provided in these videos is not verified by any designated experts but is popular. In this paper we discuss the prevalence of such videos and guide patients on how to verify their authenticity. The paper accepts that these videos are abundant and necessary in an age driven by open access to information and commercial interests.
Methods: The paper uses evidence from previous studies and observations of authors to propose the inclusion of YouTube in the health promotion toolkit of Ghanaian and African health systems.
Results: The paper proposes the improvement of health literacy and patient-caregiver communication in preparation for an active role for YouTube as a health promotion tool.
Conclusions: For patients, the paper recommends improved health literacy and communication with caregivers as an effective safety mechanism against misleading content. Caregivers are advised to accommodate patient views influenced by YouTube videos and be active participants in online spaces. Research on health literacy and effective patient-caregiver communication is recommended.

Keywords:  Caregivers; Health literacy; Health promotion; Patient-caregiver communication; Patients; YouTube

DOI:  https://doi.org/10.1016/j.pmedr.2024.102836
BMC Public Health. 2024 Aug 13. 24(1): 2208

YouTube as a source of recognizing acute stroke; progress in 2 years.

Zeynep Özdemir, Erkan Acar.

   BACKGROUND: YouTube™ has a great role in providing information, which includes educational videos, to more than 2 billion users, making it the second most popular application in the world. BE-FAST is a modified version of the FAST mnemonic and is used by patients or their relatives to detect acute ischemic stroke. The purpose of this study is to assess the overall usefulness of the information on YouTube in helping patients recognize an acute stroke.
METHODS: YouTube was searched for the following five terms: "stroke", "stroke diagnosis", "stroke signs", "brain attack" and "what is stroke" in November 2021 and May 2023, separately. Two independent neurology specialists scored each video using the Global Quality Scale (GQS).
RESULTS: Among the total of 150 videos, the number that met inclusion criteria was 91 for the November 2021 search and 104 for the May 2023 search. For the 2021 search, in 30 videos (33%), the FAST mnemonic or its contents were noticed, whereas BE-FAST was mentioned in only four videos (4.4%). For the 2023 search, the FAST mnemonic or its contents were noticed in 36 videos (34.6%) and BE-FAST was mentioned in 11 videos (10.6%). For the 2021 and 2023 searches, the mean GQS values were 3.09 and 2.96 points, and 50 (54.8%) vs. 56 (53.8%) videos were rated 3.5 points or higher (high quality), respectively. GQS scores of the videos mentioning balance, eyes, face, arms, speech, and time, the basic and advanced information about radiology and treatment, and mentioning FAST, BE-FAST, and TPA were significantly higher.
CONCLUSION: We conclude that YouTube is not yet a very useful tool for helping patients realize that they may be having an acute ischemic stroke, although the healthcare information and education available on social media has improved over the years.

Keywords:  BE-FAST; FAST; Stroke; YouTube; tPA

DOI:  https://doi.org/10.1186/s12889-024-19710-4
Int J Sex Health. 2024 ;36(3): 406-414

YouTube Videos Are a Moderately Comprehensive, Reliable, and Quality Option to Learn About "Multiple Sclerosis and Sexuality".

Esra Uslu, Gülcan Kendırkiran, Nazmiye Yildirim.

   Objectives: This study aimed to evaluate the performance, comprehensiveness, reliability, and quality of English-language YouTube videos addressing the subject of multiple sclerosis and sexuality.
Methods: In August 2023, a search was conducted on a computer using the keywords "multiple sclerosis and sexuality," "multiple sclerosis and sexual health," "multiple sclerosis and sexual health problems," and "multiple sclerosis and sexual dysfunction" for this descriptive study. According to predetermined inclusion and exclusion criteria, 38 videos that met the research purpose were examined. The related URLs were recorded. For each video, the following information was collected: content producers, performance with YouTube statistics, comprehensiveness with a form developed by researchers, reliability with Singh's Reliability Evaluation Form, and quality with Global Quality Scale. Two researchers independently evaluated the videos.
Results: Eighty-nine and a half percent of the videos contained information presented by professionals. The average number of views was 2699.132 ± 3382.848, the comprehensiveness score was 4.2 ± 1.711, the reliability score was 3.184 ± 1.182, and the quality score was 3.421 ± 1.2. Nearly half (42.2%) contained good and useful information for viewers, and half (50%) had high video quality. The reliability and quality scores of videos containing each item in terms of comprehensiveness were higher compared to videos that did not include that item (p < 0.05). In addition, the videos with higher comprehensiveness scores had higher quality and reliability scores (p < 0.001).
Conclusion: These results underscore the constrained performance attributes of YouTube videos addressing multiple sclerosis and sexuality, with their content exhibiting a moderate level of comprehensiveness, reliability, and quality. These results may provide a basis for increasing the effectiveness of YouTube videos on multiple sclerosis and sexuality.

Keywords:  Multiple sclerosis; YouTube; quality; reliability; sexuality

DOI:  https://doi.org/10.1080/19317611.2024.2349597
Cureus. 2024 Jul;16(7): e64322

Diverticulosis and Diverticulitis on YouTube: Is Popular Information the Most Reliable?

Maverick H Johnson, Goutham A Nair, Courtney K Mack, Sean O'leary, Chris J Thang, Rui-Min D Mao, Nikhil R Shah, Uma R Phatak.

  Background Patients utilize online health information to inform their medical decision-making. YouTube is one of the most popular media platforms with abundant health-related resources, yet the quality of the disseminated information remains unclear. This study aims to evaluate the quality and reliability of content pertaining to diverticulosis and diverticulitis on YouTube. Methods One author queried the terms "diverticulosis," "diverticulitis," "acute diverticulitis," and "chronic diverticulitis" on YouTube. The first 50 videos per search were selected for analysis. Duplicates, non-English videos, and procedural content were excluded. Video characteristics including view count, likes, comments, duration, days since upload, view ratio, video power index, and video sources (professional organizations (POs), health information websites (HIWs), and entertainment/independent users (EIUs)) were collected. Videos were scored using the mDISCERN and Global Quality Score (GQS). Results Sixty-four videos were included. mDISCERN scores significantly differed between POs (n=20, mean=4.35), HIWs (n=29, mean=2.97), and EIUs (n=15, mean=1.83). GQS also significantly differed between POs (n=20, mean=4.47), HIWs (n=29, mean=3.62), and EIUs (n=15, mean=2.5). Video characteristics significantly differed between groups, with most user engagement seen in EIUs. Conclusion POs and HIWs disseminate higher quality health information about diverticular disease on YouTube. The higher viewer engagement with EIUs is concerning, as these sources were found to have lower quality content. Although YouTube has the capability to provide valuable information on diverticulosis and diverticulitis, enhanced content screening is needed to ensure accuracy and validation.

Keywords:  diverticulitis; diverticulosis; patient resources; social media; youtube

DOI:  https://doi.org/10.7759/cureus.64322
Health Informatics J. 2024 Jul-Sep;30(3): 14604582241275824

Evaluation of treatment information quality on hypertension and diabetes on WeChat and TikTok: A cross-sectional content analysis.

Minxia Wu, Yongmei Yang, Yanxing Chen.

  Objective: This study aimed to assess the quality of the information in WeChat and TikTok videos related to hypertension and diabetes treatment. Methods: A sample of 120 Chinese videos was collected based on specific inclusion and exclusion criteria. The quality was evaluated using DISCERN, JAMA and the latest editions of the Chinese guidelines for hypertension and diabetes prevention and treatment, and two observers independently scored each video using the three assessment tools. Results: Among all 120 videos, only 10 scored above 38 points in DISCERN, with 45 videos rated as "very poor". None of the videos met all JAMA criteria simultaneously, and there were gaps in accuracy and completeness compared to the two guidelines. Furthermore, there was no significant correlation between information quality and the number of likes and comments. Conclusion: The current quality of information on the treatment of hypertension and diabetes on WeChat and TikTok was unsatisfactory. Consequently, the government should strengthen oversight of information quality, and social media platforms should actively review health-related content to prevent inaccurate information dissemination. Individuals should enhance their digital and health literacy.

Keywords:  Chinese videos; WeChat and TikTok; hypertension and diabetes; quality evaluation; treatment-related information

DOI:  https://doi.org/10.1177/14604582241275824
J Med Internet Res. 2024 Aug 14. 26 e54745

Sociodemographic Factors Associated With Using eHealth for Information Seeking in the United States: Cross-Sectional Population-Based Study With 3 Time Points Using Health Information National Trends Survey Data.

Christian Elias Vazquez, Rebecca L Mauldin, Denise N Mitchell, Faheem Ohri.

   BACKGROUND: Despite the potential benefits of using eHealth, sociodemographic disparities exist in eHealth use, which threatens to further widen health equity gaps. The literature has consistently shown age and education to be associated with eHealth use, while the findings for racial and ethnic disparities are mixed. However, previous disparities may have narrowed as health care interactions shifted to web-based modalities for everyone because of the COVID-19 pandemic.
OBJECTIVE: This study aims to provide an updated examination of sociodemographic disparities that contribute to the health equity gap related to using eHealth for information seeking using 3 time points.
METHODS: Data for this study came from the nationally representative 2018 (n=3504), 2020 (n=3865), and 2022 (n=6252) time points of the Health Information National Trends Survey. Logistic regression was used to regress the use of eHealth for information seeking on race and ethnicity, sex, age, education, income, health status, and year of survey. Given the consistent association of age with the dependent variable, analyses were stratified by age cohort (millennials, Generation X, baby boomers, and silent generation) to compare individuals of similar age.
RESULTS: For millennials, being female, attaining some college or a college degree, and reporting an annual income of US $50,000-$74,999 or >US $75,000 were associated with the use of eHealth for information seeking. For Generation X, being female, having attained some college or a college degree, reporting an annual income of US $50,000-$74,999 or >US $75,000, better self-reported health, and completing the survey in 2022 (vs 2018; odds ratio [OR] 1.80, 95% CI 1.11-2.91) were associated with the use of eHealth for information seeking. For baby boomers, being female, being older, attaining a high school degree, attaining some college or a college degree, reporting an annual income of US $50,000-$74,999 or >US $75,000, and completing the survey in 2020 (OR 1.56, 95% CI 1.15-2.12) and 2022 (OR 4.04, 95% CI 2.77-5.87) were associated with the use of eHealth for information seeking. Among the silent generation, being older, attaining some college or a college degree, reporting an annual income of US $50,000-$74,999 or >US $75,000, and completing the survey in 2022 (OR 5.76, 95% CI 3.05-10.89) were associated with the use of eHealth for information seeking.
CONCLUSIONS: Baby boomers may have made the most gains in using eHealth for information seeking over time. The race and ethnicity findings, or lack thereof, may indicate a reduction in racial and ethnic disparities. Disparities based on sex, education, and income remained consistent across all age groups. This aligns with health disparities literature focused on individuals with lower socioeconomic status, and more recently on men who are less likely to seek health care compared to women.
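
The odds ratios and confidence intervals reported above follow from logistic regression coefficients. As a minimal sketch, assuming a normal approximation with z = 1.96 (the abstract does not state how the intervals were computed, and survey-weighted analyses may differ), the relationship can be illustrated by backing out a coefficient and standard error from one reported result and reconstructing the interval. The function names are hypothetical.

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a logistic regression coefficient.

    Assumes a normal-approximation (Wald) interval on the log-odds scale.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def coefficient_from_reported(or_value, lower, upper, z=1.96):
    """Back out (beta, se) from a reported OR and its 95% CI."""
    beta = math.log(or_value)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se

# Reported for Generation X, 2022 vs 2018: OR 1.80, 95% CI 1.11-2.91.
beta, se = coefficient_from_reported(1.80, 1.11, 2.91)
or_value, lower, upper = odds_ratio_with_ci(beta, se)
print(round(or_value, 2), round(lower, 2), round(upper, 2))
```

Because the interval is symmetric on the log scale, the reported odds ratio sits close to the geometric mean of its bounds, which is a quick consistency check on published values.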

Keywords:  age; disparities; eHealth use; education; health information seeking; mobile phone; sex

DOI:  https://doi.org/10.2196/54745
IEEE/ACM Trans Comput Biol Bioinform. 2024 Aug 13. PP

Joint Extraction of Biomedical Events Based on Dynamic Path Planning Strategy and Hybrid Neural Network.

Xinyu He, Yujie Tang, Bo Yu, Shixin Li, Yonggong Ren.

Biomedical event detection is a pivotal information extraction task in molecular biology and biomedical research, which provides inspiration for medical search, disease prevention, and new drug development. Existing methods usually detect simple and complex biomedical events with the same model, and the performance of complex biomedical event extraction is relatively low. In this paper, we build different neural networks for simple and complex events respectively, which helps to improve the performance of complex event extraction. To avoid redundant information, we design a dynamic path planning strategy for argument detection. To make full use of the information shared between the trigger identification and argument detection subtasks, and to reduce cascading errors, we build a joint event extraction model. Experimental results demonstrate that our approach achieves the best F-score on the biomedical benchmark MLEE dataset and outperforms recent state-of-the-art methods.
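
The F-score mentioned above is conventionally the harmonic mean of precision and recall. The sketch below shows that calculation under the assumption that the balanced F1 variant is meant (the abstract does not say which variant is reported). The function name and the example counts are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Compute precision, recall, and F1 from extraction counts."""
    predicted = true_pos + false_pos   # events the model extracted
    actual = true_pos + false_neg      # events in the gold annotation
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts: 8 correct events, 2 spurious, 2 missed.
precision, recall, f1 = precision_recall_f1(8, 2, 2)
print(precision, recall, f1)
```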

DOI: https://doi.org/10.1109/TCBB.2024.3442199