bims-librar Biomed News
on Biomedical librarianship
Issue of 2024-09-22
fifteen papers selected by
Thomas Krichel, Open Library Society



  1. J Thorac Cardiovasc Surg. 2024 Sep 13. pii: S0022-5223(24)00809-2. [Epub ahead of print]
     OBJECTIVES: The task of writing structured content reviews and guidelines has grown more demanding and complex. We propose to go beyond search tools, toward curation tools, by automating time-consuming and repetitive steps of extracting and organizing information.
    METHODS: SciScribe is built as an extension of IBM's Deep Search platform, which provides document processing and search capabilities. This platform was used to ingest and search full-content publications from PubMed Central (PMC) and official, structured records from the ClinicalTrials and OpenPayments databases. Author names and NCT numbers, mentioned within the publications, were used to link publications to these official records as context. Search strategies involve traditional keyword-based search as well as natural language question and answering via large language models (LLMs).
    RESULTS: SciScribe is a web-based tool that helps accelerate literature reviews through key features: (1) accumulating a personal collection from publication sources such as PMC; (2) incorporating contextual information from external databases into the presented papers, promoting a more informed assessment by readers; (3) semantic question answering over a single document, to quickly assess relevance and hierarchical organization; and (4) semantic question answering for each document within a collection, collated into tables.
    CONCLUSIONS: Emergent language processing techniques open new avenues to accelerate and enhance the literature review process, for which we have demonstrated a use case implementation within cardiac surgery. SciScribe automates and accelerates this process, mitigates errors associated with repetition and fatigue, and instantly contextualizes results by linking relevant external data sources.
    Keywords:  ClinicalTrials; Contextualization; GenAI; Generative AI; Large Language Model; Literature Review; Literature Search; OpenPayments; PubMed
    DOI:  https://doi.org/10.1016/j.jtcvs.2024.09.014
  2. Comput Struct Biotechnol J. 2024 Dec;23: 3247-3253
      The process of navigating through the landscape of biomedical literature and performing searches or combining them with bioinformatics analyses can be daunting, considering the exponential growth of scientific corpora and the plethora of tools designed to mine PubMed® and related repositories. Herein, we present BioTextQuest v2.0, a tool for biomedical literature mining. BioTextQuest v2.0 is an open-source online web portal for document clustering based on sets of selected biomedical terms, offering efficient management of information derived from PubMed abstracts. Employing established machine learning algorithms, the tool facilitates document clustering while allowing users to customize the analysis by selecting terms of interest. BioTextQuest v2.0 streamlines the process of uncovering valuable insights from biomedical research articles, serving as an agent that connects the identification of key terms like genes/proteins, diseases, chemicals, Gene Ontology (GO) terms, functions, and others through named entity recognition, and their application in biological research. Instead of manually sifting through articles, researchers can enter their PubMed-like query and receive extracted information in two user-friendly formats, tables and word clouds, simplifying the comprehension of key findings. The latest update of BioTextQuest leverages the EXTRACT named entity recognition tagger, enhancing its ability to pinpoint various biological entities within text. BioTextQuest v2.0 acts as a research assistant, significantly reducing the time and effort required for researchers to identify and present relevant information from the biomedical literature.
    Keywords:  Biomedical literature mining; Concept discovery
    DOI:  https://doi.org/10.1016/j.csbj.2024.08.016
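Pipelines like BioTextQuest start from a PubMed-like query. A minimal sketch of retrieving matching PMIDs through NCBI's public E-utilities interface (the endpoint and parameters are NCBI's documented API; the helper names and example query are illustrative, not the tool's own code):

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def esearch_url(query, retmax=20):
    """Build an E-utilities esearch URL for a PubMed-like query."""
    params = {"db": "pubmed", "term": query, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"

def fetch_pmids(query, retmax=20):
    """Fetch the PMIDs matching a query (requires network access)."""
    with urlopen(esearch_url(query, retmax)) as resp:
        data = json.load(resp)
    return data["esearchresult"]["idlist"]
```

For example, fetch_pmids('"text mining"[tiab]') returns a list of PMID strings, which a downstream tool could then cluster or tag with named entity recognition.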
  3. Heliyon. 2024 Sep 15. 10(17): e36351
       Background: The ever-increasing volume of academic literature necessitates efficient and sophisticated tools for researchers to analyze, interpret, and uncover trends. Traditional search methods, while valuable, often fail to capture the nuance and interconnectedness of vast research domains.
    Results: TopicTracker, a novel software tool, addresses this gap by providing a comprehensive solution from querying PubMed databases to creating intricate semantic network maps. Through its functionalities, users can systematically search for desired literature, analyze trends, and visually represent co-occurrences in a given field. Our case studies, including support for the WHO on ethical considerations in infodemic management and mapping the evolution of ethics pre- and post-pandemic, underscore the tool's applicability and precision.
    Conclusions: TopicTracker represents a significant advancement in academic research tools for text mining. While it has its limitations, primarily tied to its alignment with PubMed, its benefits far outweigh the constraints. As the landscape of research continues to expand, tools like TopicTracker may be instrumental in guiding scholars in their pursuit of knowledge, ensuring they navigate the large amount of literature with clarity and precision.
    Keywords:  Automated literature review; Customizable text mining; Open-source text analysis; PubMed data analysis; Reproducible workflow; Scientific literature processing; Text mining pipeline
    DOI:  https://doi.org/10.1016/j.heliyon.2024.e36351
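The co-occurrence mapping described above can be illustrated with a small sketch (not TopicTracker's actual code): counting how often pairs of terms appear in the same abstract yields the weighted edges of a semantic network.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(docs_terms):
    """docs_terms: a list of term sets, one per document.
    Returns a Counter mapping sorted term pairs to the number of
    documents in which both terms appear together (edge weights)."""
    edges = Counter()
    for terms in docs_terms:
        for a, b in combinations(sorted(terms), 2):
            edges[(a, b)] += 1
    return edges
```

With documents {"ethics", "infodemic"} and {"ethics", "infodemic", "pandemic"}, the pair ("ethics", "infodemic") gets weight 2, and the heaviest edges become the backbone of the network map.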
  4. BMC Bioinformatics. 2024 Sep 16. 25(1): 303
     BACKGROUND: Literature-based discovery (LBD) aims to help researchers identify relations between concepts which are worthy of further investigation by text-mining the biomedical literature. While the LBD literature is rich and the field is considered mature, standard practice in the evaluation of LBD methods is methodologically poor and has not progressed on par with the domain. The lack of properly designed, adequately sized benchmark datasets hinders the progress of the field and its development into applications usable by biomedical experts.
    RESULTS: This work presents a method for mining past discoveries from the biomedical literature. It leverages the impact made by a discovery, using descriptive statistics to detect surges in the prevalence of a relation across time. The validity of the method is tested against a baseline representing the state-of-the-art "time-sliced" method.
    CONCLUSIONS: This method allows the collection of a large number of time-stamped discoveries. These can be used for LBD evaluation, alleviating the long-standing issue of inadequate evaluation. It might also pave the way for more fine-grained LBD methods, which could exploit the diversity of these past discoveries to train supervised models. Finally, the dataset (or some future version of it inspired by our method) could be used as a methodological tool for systematic reviews. To this end, we provide an online exploration tool, available at https://brainmend.adaptcentre.ie/ .
    Keywords:  Benchmark dataset; Evaluation; Literature-based discovery; Time-sliced method
    DOI:  https://doi.org/10.1186/s12859-024-05881-9
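The surge-detection idea, flagging the year a relation's prevalence jumps well above its historical baseline, can be sketched with a simple running-statistics rule (a simplified z-score criterion for illustration, not the authors' exact statistic):

```python
from statistics import mean, pstdev

def detect_surge(yearly_counts, k=2.0):
    """yearly_counts: list of (year, count) pairs in chronological order.
    Return the first year whose count exceeds the mean of all prior
    years by more than k population standard deviations, or None."""
    for i in range(2, len(yearly_counts)):
        history = [c for _, c in yearly_counts[:i]]
        mu, sigma = mean(history), pstdev(history)
        year, count = yearly_counts[i]
        if sigma > 0 and count > mu + k * sigma:
            return year
    return None
```

Applied to a relation mentioned once or twice a year that suddenly appears 20 times, the rule dates the "discovery" to the surge year; a flat series yields no surge.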
  5. J Exp Orthop. 2024 Jul;11(3): e70019
       Purpose: The internet has become a primary source for patients seeking healthcare information, but the quality of online information, particularly in orthopaedics, often falls short. Orthopaedic surgeons now have the added responsibility of evaluating and guiding patients to credible online resources. This study aimed to assess ChatGPT's ability to identify deficiencies in patient information texts related to total hip arthroplasty websites and to evaluate its potential for enhancing the quality of these texts.
    Methods: In August 2023, 25 websites related to total hip arthroplasty were assessed using a standardized search on Google. Peer-reviewed scientific articles, empty pages, dictionary definitions, and unrelated content were excluded. The remaining 10 websites were evaluated using the hip information scoring system (HISS). ChatGPT was then used to assess these texts, identify deficiencies and provide recommendations.
    Results: The mean HISS score of the websites was 9.5, indicating low to moderate quality. However, after implementing ChatGPT's suggested improvements, the score increased to 21.5, signifying excellent quality. ChatGPT's recommendations included using simpler language, adding FAQs, incorporating patient experiences, addressing cost and insurance issues, detailing preoperative and postoperative phases, including references, and emphasizing emotional and psychological support. The study demonstrates that ChatGPT can significantly enhance patient information quality.
    Conclusion: ChatGPT's role in elevating patient education regarding total hip arthroplasty is promising. This study sheds light on the potential of ChatGPT as an aid to orthopaedic surgeons in producing high-quality patient information materials. Although it cannot replace human expertise, it offers a valuable means of enhancing the quality of healthcare information available online.
    Level of Evidence: Level IV.
    Keywords:  ChatGPT; arthroplasty; artificial intelligence; healthcare information; hip; orthopaedic surgery; patient‐centered care; total hip replacement
    DOI:  https://doi.org/10.1002/jeo2.70019
  6. JSES Int. 2024 Sep;8(5): 1016-1018
       Background: The aim of this study is to evaluate whether Chat Generative Pretrained Transformer (ChatGPT) can be recommended as a resource for informing patients planning rotator cuff repairs, and to assess the differences between ChatGPT 3.5 and 4.0 versions in terms of information content and readability.
    Methods: In August 2023, 13 questions commonly asked by patients with rotator cuff disease were posed to ChatGPT 3.5 and ChatGPT 4 by 3 surgeons experienced in rotator cuff surgery, using computers with different internet protocol addresses. After converting the answers of both versions into text, the quality and readability of the answers were examined.
    Results: The average Journal of the American Medical Association score for both versions was 0, and the average DISCERN score was 61.6. A statistically significant and strong correlation was found between ChatGPT 3.5 and 4.0 DISCERN scores. There was excellent agreement in DISCERN scores for both versions among the 3 evaluators. ChatGPT 3.5 was found to be less readable than ChatGPT 4.0.
    Conclusion: The information provided by the ChatGPT conversational system was evaluated as of high quality, but there were significant shortcomings in terms of reliability due to the lack of citations. Despite the ChatGPT 4.0 version having higher readability scores, both versions were considered difficult to read.
    Keywords:  Arthroscopy; Artificial intelligence; ChatGPT; OpenAI; Rotator cuff; Shoulder
    DOI:  https://doi.org/10.1016/j.jseint.2024.04.016
  7. OTO Open. 2024 Jul-Sep;8(3): e70011
       Objective: While most patients with COVID-19-induced olfactory dysfunction (OD) recover spontaneously, those with persistent OD face significant physical and psychological sequelae. ChatGPT, an artificial intelligence chatbot, has grown as a tool for patient education. This study seeks to evaluate the quality of ChatGPT-generated responses for COVID-19 OD.
    Study Design: Quantitative observational study.
    Setting: Publicly available online website.
    Methods: ChatGPT (GPT-4) was queried 4 times with 30 identical questions. Prior to questioning, ChatGPT was "prompted" to respond (1) to a patient, (2) to an eighth grader, (3) with references, and (4) with no prompt. Answer accuracy was independently scored by 4 rhinologists using the Global Quality Score (GCS, range: 1-5). Proportions of responses at incremental score thresholds were compared using χ² analysis. The Flesch-Kincaid grade level was calculated for each answer. The relationship between prompt type and grade level was assessed via analysis of variance.
    Results: Across all graded responses (n = 480), 364 responses (75.8%) were "at least good" (GCS ≥ 4). Proportions of responses that were "at least good" (P < .0001) or "excellent" (GCS = 5) (P < .0001) differed by prompt; "at least moderate" (GCS ≥ 3) responses did not (P = .687). Eighth-grade (14.06 ± 2.3) and patient-friendly (14.33 ± 2.0) responses had significantly lower mean grade levels than responses with no prompting (P < .0001).
    Conclusion: ChatGPT provides appropriate answers to most questions on COVID-19 OD regardless of prompting. However, prompting influences response quality and grade level. ChatGPT responds at grade levels above accepted recommendations for presenting medical information to patients. Currently, ChatGPT offers significant potential for patient education as an adjunct to the conventional patient-physician relationship.
    Keywords:  AI hallucination; COVID‐19; ChatGPT; Flesch‐Kincaid grade level; anosmia; artificial intelligence; chatbot; olfactory dysfunction; patient education; prompting
    DOI:  https://doi.org/10.1002/oto2.70011
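The Flesch-Kincaid grade level used above is a closed-form formula over word, sentence, and syllable counts. A minimal sketch taking pre-computed counts (syllable counting itself requires a heuristic or a pronunciation dictionary and is omitted here):

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw text counts.
    Higher values mean harder text; ~8 corresponds to eighth grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

For example, 100 words across 5 sentences with 150 syllables scores about 9.9, roughly tenth-grade text; the mean levels of ~14 reported above sit at a college reading level.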
  8. Ann Rheum Dis. 2024 Sep 18. pii: ard-2024-226202. [Epub ahead of print]
       OBJECTIVES: The aim of this study was to assess the accuracy and readability of the answers generated by large language model (LLM)-chatbots to common patient questions about low back pain (LBP).
    METHODS: This cross-sectional study analysed responses to 30 LBP-related questions, covering self-management, risk factors and treatment. The questions were developed by experienced clinicians and researchers and were piloted with a group of consumer representatives with lived experience of LBP. The questions were entered as prompts into ChatGPT 3.5, Bing, Bard (Gemini) and ChatGPT 4.0. Responses were evaluated in relation to their accuracy, readability and presence of disclaimers about health advice. The accuracy was assessed by comparing the recommendations generated with the main guidelines for LBP. The responses were analysed by two independent reviewers and classified as accurate, inaccurate or unclear. Readability was measured with the Flesch Reading Ease Score (FRES).
    RESULTS: Out of 120 responses yielding 1069 recommendations, 55.8% were accurate, 42.1% inaccurate and 1.9% unclear. Treatment and self-management domains showed the highest accuracy, while risk factors had the most inaccuracies. Overall, LLM-chatbots provided answers that were 'reasonably difficult' to read, with a mean (SD) FRES score of 50.94 (3.06). Disclaimers about health advice were present in around 70%-100% of the responses produced.
    CONCLUSIONS: The use of LLM-chatbots as tools for patient education and counselling in LBP shows promising but variable results. These chatbots generally provide moderately accurate recommendations. However, the accuracy may vary depending on the topic of each question. The readability level of the answers was inadequate, potentially affecting the patient's ability to comprehend the information.
    Keywords:  Internet; Low Back Pain; Pain
    DOI:  https://doi.org/10.1136/ard-2024-226202
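The Flesch Reading Ease Score reported above is likewise a fixed formula; a sketch from pre-computed counts (helper name illustrative). Scores between 50 and 60 fall in the 'fairly difficult' band, consistent with the mean of 50.94 reported here:

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease Score from raw text counts.
    Higher scores mean easier text (90-100 very easy, 50-60 fairly
    difficult, 0-30 very difficult)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
```

For example, 100 words across 5 sentences with 150 syllables gives 59.635, near the top of the 'fairly difficult' band.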
  9. Aesthetic Plast Surg. 2024 Sep 16.
     OBJECTIVE: Assessment of the readability, accuracy, quality, and completeness of ChatGPT (OpenAI, San Francisco, CA), Gemini (Google, Mountain View, CA), and Claude (Anthropic, San Francisco, CA) responses to common questions about rhinoplasty.
    METHODS: Ten questions commonly encountered in the senior author's (SPM) rhinoplasty practice were presented to ChatGPT-4, Gemini and Claude. Seven Facial Plastic and Reconstructive Surgeons with experience in rhinoplasty were asked to evaluate these responses for accuracy, quality, completeness, relevance, and use of medical jargon on a Likert scale. The responses were also evaluated using several readability indices.
    RESULTS: ChatGPT achieved significantly higher evaluator scores for accuracy and overall quality, but scored significantly lower on completeness compared to Gemini and Claude. All three chatbots' responses to the ten questions were rated as neutral to incomplete. All three chatbots were found to use medical jargon, and their responses scored at a college reading level on readability indices.
    CONCLUSIONS: Rhinoplasty surgeons should be aware that the medical information found on chatbot platforms is incomplete and still needs to be scrutinized for accuracy. However, the technology does have potential for use in healthcare education by training it on evidence-based recommendations and improving readability.
    LEVEL OF EVIDENCE V: This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors at www.springer.com/00266.
    Keywords:  Artificial intelligence; Large language models; Rhinoplasty
    DOI:  https://doi.org/10.1007/s00266-024-04343-0
  10. NeuroRehabilitation. 2024 Sep 18.
       BACKGROUND: YouTube has emerged as an important source for obtaining information regarding health issues.
    OBJECTIVE: The study aimed to assess the reliability and quality of facial paralysis exercise videos that are accessible on the YouTube platform.
    METHODS: The investigation was carried out on YouTube, utilizing the keyword "facial paralysis exercises". We listed the first 100 videos based on relevancy. The quality and reliability of the videos were assessed using DISCERN, the Journal of the American Medical Association (JAMA) Benchmark Criteria, the Global Quality Scale (GQS), and the Video Power Index (VPI).
    RESULTS: Out of 100 videos, we excluded 52 and included the remaining 48. The scores we obtained for the videos were as follows: DISCERN Quality (2.92±0.91), DISCERN Total (39.16±6.75), JAMA (2.09±0.55), and GQS (3.00±0.89). Our study also revealed that videos uploaded by healthcare professionals had significantly higher DISCERN total, JAMA and VPI scores compared to those uploaded by non-healthcare professionals (p = 0.018, 0.001 and 0.023, respectively). Additionally, we observed a positive and statistically significant correlation between the DISCERN quality score, total score, JAMA, and video features.
    CONCLUSION: The facial paralysis exercise videos were determined to be of medium to low quality. Higher-quality videos need to be produced.
    Keywords:  Bell’s palsy; exercise; facial paralysis; information sources; internet; neurological rehabilitation; rehabilitation
    DOI:  https://doi.org/10.3233/NRE-240027
  11. Narra J. 2024 Aug;4(2): e877
    Social media platforms, including TikTok, have become influential sources of health information. However, they also serve as potential sources for the spread of vaccine misinformation. The aim of this study was to assess the quality of measles-rubella (MR) vaccine-related content on TikTok in Jordan and to analyze factors associated with vaccine misinformation. A systematic search for MR vaccine-related TikTok content in Jordan was conducted using pre-defined keywords and a specified time range. Content metrics (likes, comments, shares, and saves) were collected, while the content quality of health information was evaluated by two expert raters using a modified version of DISCERN, a validated instrument. The average modified DISCERN score ranged from 1, denoting poor content, to 5, indicating excellent content. A total of 50 videos from 34 unique content creators formed the final study sample. The majority of MR vaccine-related content was created by lay individuals (61.8%), followed by TV/news websites/journalists (23.5%), and healthcare professionals (HCPs) (14.7%). The Cohen κ per modified DISCERN item was in the range of 0.579-0.808 (p<0.001), indicating good to excellent agreement. The overall average modified DISCERN score was 2±1.2, while it was only 1.3±0.52 for lay individuals' content, which indicated poor content quality. Normalized per the number of followers for each source, content by lay individuals had a significantly higher number of likes, saves, and shares (p=0.009, 0.012, and 0.004, respectively). Vaccine misinformation was detected in 58.8% of the videos, as follows: lay individuals (85.7%), TV/news websites/journalists (25.0%), and HCP content had none (p<0.001). Normalized per the number of followers for each source, videos flagged as having MR vaccine misinformation reached a higher number of likes, saves, and shares (p=0.012, 0.016, and 0.003, respectively).
In conclusion, substantial dissemination of TikTok MR vaccine-related misinformation in Jordan was detected. Rigorous fact-checking is warranted by the platform to address misinformation on TikTok, which is vital to improve trust in MR vaccination and ultimately protect public health.
    Keywords:  Vaccine misinformation; health communication; public health; social media; vaccine hesitancy
    DOI:  https://doi.org/10.52225/narra.v4i2.877
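Inter-rater agreement via Cohen's κ, as used for the modified DISCERN items above, can be computed directly from two raters' labels; a minimal sketch with illustrative ratings:

```python
from collections import Counter

def cohen_kappa(r1, r2):
    """Cohen's kappa for two raters labelling the same items.
    kappa = (observed agreement - chance agreement) / (1 - chance)."""
    assert len(r1) == len(r2) and r1
    n = len(r1)
    observed = sum(a == b for a, b in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / (n * n)
    return (observed - expected) / (1 - expected)
```

With ratings [1, 1, 2, 2] and [1, 1, 2, 1], observed agreement is 0.75 and chance agreement is 0.5, giving κ = 0.5 (moderate agreement); values above roughly 0.6 are conventionally read as good.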
  12. Front Public Health. 2024;12: 1446003
       Background: The prevalence of heatstroke is rising due to global warming, making it a serious but preventable condition, highlighting the urgent need for effective dissemination of relevant health education to the general public. Advances in technology have made accessing health information more convenient and rapid. In recent years, short videos have become a primary medium for delivering health education, with TikTok gaining considerable popularity among the general public. However, the quality of heatstroke-related health education content on TikTok deserves closer scrutiny.
    Objective: This study aimed to evaluate the quality and content of heatstroke-related videos available on TikTok.
    Methods: The present study analyzed the top 100 heatstroke-related short videos on TikTok, focusing on their characteristics, quality, and the content they conveyed. The quality of these videos was assessed using the DISCERN instrument. In addition, the completeness of the videos was assessed by examining six key aspects: disease definition, clinical manifestations, risk factors, assessment, management, and outcomes.
    Results: The study included a total of 90 videos. The results showed that news organizations and healthcare professionals were the primary contributors to these videos, with those from news organizations receiving the most attention. In contrast, those from healthcare professionals received comparatively less engagement. Overall, the quality of the information was found to be moderately low, with the highest quality videos posted by non-profit organizations, followed by those posted by healthcare professionals. The majority of videos uploaded described the disease definition, clinical presentation, risk factors, assessment, management, and outcomes of heatstroke.
    Conclusion: The quality of information provided in heatstroke-related short videos on TikTok is generally inadequate and requires significant improvement. In addition, such content should be subject to government review to ensure its accuracy and reliability.
    Keywords:  DISCERN; TikTok; heatstroke; information quality; social media
    DOI:  https://doi.org/10.3389/fpubh.2024.1446003
  13. J Prim Care Community Health. 2024 Jan-Dec;15: 21501319241277576
    Health Information Seeking Behavior (HISB) refers to the behavior and strategies used to attain, clarify, or confirm health information. The uptake of health information depends on system-level and individual-level factors. The purpose of the present study is to understand the sources from which Punjabi elders obtain COVID-19 vaccine-related information and their information seeking behavior. A cross-sectional survey was conducted among 391 Punjabi elders aged 50+ years in the Greater Toronto Area (GTA), Ontario. The survey questions included the need for COVID-19 vaccine information, the type of information sought, sources of information, and barriers to seeking information. Descriptive analysis was conducted using frequencies and percentages, and logistic regression was performed to understand the associations between participants' sociodemographic characteristics and HISB. The results suggested that Punjabi elders are more likely to use informal sources and less likely to seek information from health professionals and government health and wellness websites. The results also suggested that most participants do not cross-check their information with other sources and are more likely to cross-check the information with family/friends, compared to credible care providers, across all demographics. Ultimately, there may be a need for stakeholders to collaborate to regulate the accuracy and type of health information that is disseminated through media, and to tailor health communication to the health information seeking behavior of this population.
    Keywords:  COVID-19; Health Information Seeking Behavior; elderly populations; health communication; immigrants
    DOI:  https://doi.org/10.1177/21501319241277576
  14. Nurs Open. 2024 Sep;11(9): e70029
     AIM: To explore Australian-Chinese immigrants' health literacy, and their preferences for and engagement with translated diabetes self-management patient education materials.
    DESIGN: The cross-sectional survey was conducted with Australian-Chinese immigrants at risk or with type 2 diabetes recruited via health services, and diabetes and community organisations.
    METHODS: The survey had three parts: (1) diabetes screening; (2) sociodemographic information, clinical characteristics and preferences for translated materials; and (3) Functional, Communicative and Critical Health Literacy (FCCHL) Scale.
    RESULTS: Of 381 participants, 54.3% reported diabetes (n = 207) and the remainder pre-diabetes or at-risk status (45.7%, n = 174); 34.1% were male; mean age was 64.1 years. Average total health literacy (FCCHL) scores were 35.3/56 (SD = 8.7). Participants with greater English proficiency reported higher health literacy (p < 0.001). This pattern also held for the functional (p < 0.001), communicative (p = 0.007) and critical (p = 0.041) health literacy subdomains. Health literacy scores did not differ significantly based on years of residence in Australia (all p > 0.05). Although the majority of participants (75.6%, N = 288) were willing to receive translated diabetes information, only a small proportion (19.7%, N = 75) reported receiving such materials.
    CONCLUSION: There is a clear need for co-designed diabetes patient education materials that meet the needs and adequately reach Australian-Chinese immigrants. In particular, these materials must support people with limited English-language proficiency.
    IMPLICATIONS FOR NURSING PRACTICE: This study highlights important considerations for nurses seeking to improve diabetes care for Chinese immigrants when incorporating patient education materials as part of their nursing education.
    Keywords:  Chinese immigrants; diabetes; health literacy; patient education materials
    DOI:  https://doi.org/10.1002/nop2.70029