bims-librar Biomed News
on Biomedical librarianship
Issue of 2021–05–09
nineteen papers selected by
Thomas Krichel, Open Library Society



  1. PLoS One. 2021 ;16(5): e0234221
      This study compared the results of data collected from a longitudinal query analysis of the MEDLINE database hosted on multiple platforms that include PubMed, EBSCOHost, Ovid, ProQuest, and Web of Science. The goal was to identify variations among the search results on the platforms after controlling for search query syntax. We devised twenty-nine cases of search queries, each comprising five semantically equivalent queries, to search against the five MEDLINE database platforms. We ran our queries monthly for a year and collected search result count data to observe changes. We found that search results varied considerably depending on MEDLINE platform. Variations were due to trends in scholarly publication, such as publishing individual papers online first versus complete issues. Other reasons were metadata differences in bibliographic records, differences in the levels of specificity of search fields provided by the platforms, and large fluctuations in monthly search results based on the same query. Database integrity and currency issues were observed as each platform updated its MEDLINE data throughout the year. Specific biomedical bibliographic databases are used to inform clinical decision-making, create systematic reviews, and construct knowledge bases for clinical decision support systems. They serve as essential information retrieval and discovery tools to help identify and collect research data and are used in a broad range of fields and as the basis of multiple research designs. This study should help clinicians, researchers, librarians, informationists, and others understand how these platforms differ and inform future work in their standardization.
    DOI:  https://doi.org/10.1371/journal.pone.0234221
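One way to quantify the cross-platform variation described in the abstract above is to compare the result counts that semantically equivalent queries return in the same month. The following is a minimal sketch, not the authors' analysis: the counts are invented for illustration, and the function name `relative_spread` is hypothetical.

```python
from statistics import mean, pstdev

def relative_spread(counts):
    """Return (range as a fraction of the mean, coefficient of variation)."""
    values = list(counts.values())
    avg = mean(values)
    return (max(values) - min(values)) / avg, pstdev(values) / avg

# Invented result counts for one query case in one month (not data from the study).
example_counts = {
    "PubMed": 1000,
    "EBSCOHost": 990,
    "Ovid": 1010,
    "ProQuest": 980,
    "Web of Science": 1020,
}

range_fraction, cv = relative_spread(example_counts)
print(f"range/mean = {range_fraction:.3f}, coefficient of variation = {cv:.3f}")
```

Repeating the calculation for each month would show whether the spread between platforms is stable or fluctuates over time.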
  2. Nucleic Acids Res. 2021 May 05. pii: gkab326. [Epub ahead of print]
      Searching and reading relevant literature is a routine practice in biomedical research. However, it is challenging for a user to design optimal search queries using all the keywords related to a given topic. As such, existing search systems such as PubMed often return suboptimal results. Several computational methods have been proposed as an effective alternative to keyword-based query methods for literature recommendation. However, those methods require specialized knowledge in machine learning and natural language processing, which can make them difficult for biologists to utilize. In this paper, we propose LitSuggest, a web server that provides an all-in-one literature recommendation and curation service to help biomedical researchers stay up to date with scientific literature. LitSuggest combines advanced machine learning techniques for suggesting relevant PubMed articles with high accuracy. In addition to innovative text-processing methods, LitSuggest offers multiple advantages over existing tools. First, LitSuggest allows users to curate, organize, and download classification results in a single interface. Second, users can easily fine-tune LitSuggest results by updating the training corpus. Third, results can be readily shared, enabling collaborative analysis and curation of scientific literature. Finally, LitSuggest provides an automated personalized weekly digest of newly published articles for each user's project. LitSuggest is publicly available at https://www.ncbi.nlm.nih.gov/research/litsuggest.
    DOI:  https://doi.org/10.1093/nar/gkab326
  3. Curr Dev Nutr. 2021 Feb;5(2): nzab002
      We assessed the quality of online health and nutrition information using a Google™ search on "supplements for cancer". Search results were scored using the Health Information Quality Index (HIQI), a quality-rating tool consisting of 12 objective criteria related to website domain, lack of commercial aspects, and authoritative nature of the health and nutrition information provided. Possible scores ranged from 0 (lowest) to 12 ("perfect" or highest quality). After eliminating irrelevant results, the remaining 160 search results had median and mean scores of 8. One-quarter of the results were of high quality (score of 10-12). There was no correlation between high-quality scores and early appearance in the sequence of search results, where results are presumably more visible. Also, 496 advertisements, over twice the number of search results, appeared. We conclude that the Google™ search engine may have shortcomings when used to obtain information on dietary supplements and cancer.
    Keywords:  Google search; Health Information Quality Index (HIQI); advertisement; authoritative information; commercial; nutrition; online; ranking
    DOI:  https://doi.org/10.1093/cdn/nzab002
  4. Annu Rev Biomed Data Sci. 2020 Jul;3 23-41
      Knowledge-based biomedical data science involves the design and implementation of computer systems that act as if they knew about biomedicine. Such systems depend on formally represented knowledge in computer systems, often in the form of knowledge graphs. Here we survey recent progress in systems that use formally represented knowledge to address data science problems in both clinical and biological domains, as well as progress on approaches for creating knowledge graphs. Major themes include the relationships between knowledge graphs and machine learning, the use of natural language processing to construct knowledge graphs, and the expansion of novel knowledge-based approaches to clinical and biological domains.
    Keywords:  Semantic Web; knowledge discovery; knowledge graph; knowledge graph embeddings; natural language processing; ontology
    DOI:  https://doi.org/10.1146/annurev-biodatasci-010820-091627
  5. PeerJ Comput Sci. 2021 ;7 e445
       Background: Results of scientific experiments and research work, whether conducted by individuals or organizations, are published and shared with the scientific community in different types of scientific publications such as books, chapters, journals, articles, reference works and reference work entries. One aspect of these documents is their content and the other is their metadata. Metadata of scientific documents could be used to increase mutual cooperation, find people with common interests and research work, and find scientific documents in matching domains. The major obstacle to obtaining these benefits is that the metadata of scientific publications are available only in unstructured (or semi-structured) formats, so they cannot be used to ask smart queries that help in computing and performing different types of analysis on scientific publications data. Also, acquisition and smart processing of publications data is a complicated, time-consuming and resource-intensive task.
    Methods: To address this problem we have developed a generic framework named the Linked Open Publications Data Framework (LOPDF). LOPDF can be used to crawl, process, extract and produce machine-understandable data (i.e., LOD) about scientific publications from different publisher-specific sources such as portals, XML exports and websites. In this paper we present the architecture, process and algorithm that we developed to process textual publications data and to produce semantically enriched data as RDF datasets (i.e., open data).
    Results: The resulting datasets can be used to make smart queries using the SPARQL protocol. We also present quantitative as well as qualitative analysis of our resulting datasets, which ultimately can be used to compute the research behavior of organizations in a rapidly growing knowledge society. Finally, we present the potential uses of producing and processing such open data of scientific publications and how the results of smart queries on the resulting open datasets can be used to compute the impact of, and perform different types of analysis on, scientific publications data.
    Keywords:  Algorithms analysis; Digital libraries; Ontological reasoning; Open data
    DOI:  https://doi.org/10.7717/peerj-cs.445
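The queries mentioned in the abstract above rest on matching patterns against subject-predicate-object triples. The sketch below illustrates only that underlying idea in plain Python. It is not SPARQL and not the LOPDF implementation; the `match` helper and every `ex:` identifier are hypothetical and invented for illustration.

```python
def match(triples, pattern):
    """Yield variable bindings for triples matching one (s, p, o) pattern.

    Pattern terms starting with "?" are variables; other terms must match exactly.
    """
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice in the pattern must bind to the same value.
                if bindings.get(term, value) != value:
                    break
                bindings[term] = value
            elif term != value:
                break
        else:
            yield bindings

# Hypothetical publication metadata expressed as triples (illustration only).
triples = [
    ("ex:paper1", "ex:author", "ex:alice"),
    ("ex:paper1", "ex:year", "2020"),
    ("ex:paper2", "ex:author", "ex:alice"),
    ("ex:paper2", "ex:author", "ex:bob"),
]

papers_by_alice = [b["?paper"] for b in match(triples, ("?paper", "ex:author", "ex:alice"))]
print(papers_by_alice)
```

A real SPARQL engine joins several such patterns and adds filters and aggregation, which is what makes questions such as "who publishes with whom" answerable over linked data.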
  6. AMIA Annu Symp Proc. 2020 ;2020 1031-1040
      This year fewer than 200 National Library of Medicine indexers expect to index 1 million articles, and this would not be possible without the assistance of the Medical Text Indexer (MTI) system. MTI is an automated indexing system that provides MeSH main heading/subheading pair recommendations to assist indexers with their heavy workload. Over the years, considerable research effort has focused on improving main heading prediction performance, but automated fine-grained indexing with main heading/subheading pairs has received much less attention. This work revisits the subheading attachment problem and demonstrates very significant performance improvements using modern Convolutional Neural Network classifiers. The best performing method is shown to outperform the current MTI implementation with a 3.7% absolute improvement in precision and a 27.6% absolute improvement in recall. We also conducted a manual review of false positive predictions, and 70% were found to be acceptable indexing.
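For readers less familiar with the metrics in the abstract above, the sketch below shows how precision and recall can be computed for predicted main heading/subheading pairs, and what an "absolute improvement" means (a difference in percentage points). The pairs and both prediction sets are invented for illustration and are not output of MTI or of the paper's classifiers.

```python
def precision_recall(predicted, gold):
    """Compute precision and recall for sets of predicted and gold labels."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Invented main heading/subheading pairs for one article (illustration only).
gold = {("Neoplasms", "therapy"), ("Neoplasms", "diagnosis"),
        ("Liver", "pathology"), ("Hypertension", "drug therapy")}
baseline = {("Neoplasms", "therapy"), ("Liver", "surgery")}
improved = {("Neoplasms", "therapy"), ("Neoplasms", "diagnosis"),
            ("Liver", "pathology"), ("Liver", "surgery")}

p_base, r_base = precision_recall(baseline, gold)
p_new, r_new = precision_recall(improved, gold)

# An absolute improvement is a difference in percentage points.
print(f"precision: {p_base:.2f} -> {p_new:.2f} ({(p_new - p_base) * 100:+.1f} points)")
print(f"recall:    {r_base:.2f} -> {r_new:.2f} ({(r_new - r_base) * 100:+.1f} points)")
```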
  7. F1000Res. 2021 ;pii: Chem Inf Sci-294. [Epub ahead of print]10
      As chemical information evolves, impacting many chemistry areas, effective ways to disseminate results by the scientific community are also changing. Thus, publication schemes adapt to meet the needs of researchers across disciplines to share high-quality data, information, and knowledge. Since 2015, the F1000Research Chemical Information Science (CIS) gateway has offered an open and unique model to disseminate science at the interface of chemoinformatics, bioinformatics, and several other informatics-related disciplines. In response to the evolution of chemical information science, the F1000Research CIS gateway has added new members to the advisory board. It is also reinforcing and expanding the gateway areas with a particular focus on machine learning and metabolomics. The range of available article types, availability of data, exposure within complementary multidisciplinary F1000Research gateways, and indexing in major bibliographic databases increase the visibility of all contributions. As part of progressing open science in this field, we look forward to your high-quality contributions to the CIS gateway.
    Keywords:  Chemical Information science; bioinformatics; chemoinformatics; informatics; machine learning; metabolomics; open science
    DOI:  https://doi.org/10.12688/f1000research.52192.1
  8. Database (Oxford). 2021 May 01. pii: baab024. [Epub ahead of print]2021
      Cannabis is one of the most versatile genera in terms of plant uses and has been exploited by humans for millennia due to its medicinal properties, strong fibres, nutritious seeds and psychoactive resin. Nowadays, Cannabis is the centre of many scientific studies, which mainly focus on its chemical composition and medicinal properties. Unfortunately, while new applications of this plant are continuously being developed, some of its traditional uses are becoming rare and even disappearing altogether. Information on traditional uses of Cannabis is vast, but it is scattered across many publication sources in different formats, so synthesis and standardization of these data are increasingly important. The CANNUSE database provides an organized information source for scientists and the general public interested in different aspects of Cannabis use. It contains over 2300 entries from 649 publications related to medicinal, alimentary, fibre and other uses from different geographical areas and cultures around the world. We believe this database will serve as a starting point for new research and development strategies based on traditional knowledge. Database URL: http://cannusedb.csic.es.
    DOI:  https://doi.org/10.1093/database/baab024
  9. AMIA Annu Symp Proc. 2020 ;2020 452-461
      New medical research concerning the spine and its diseases is incrementally made available through biomedical literature repositories. Several Natural Language Processing (NLP) tasks, like Semantic Role Labelling (SRL) and Information Extraction (IE), can support the automatic extraction of relevant information about the spine from scientific papers. This paper presents a domain-specific FrameNet, called SpiNet, for automatic information extraction about spine concepts and their semantic types. For this, we use frame semantics and the MeSH ontology to extract the relevant information about a disease, a treatment, a medication, or a sign or symptom related to the spine medical domain. The distinguishing feature of this work is the enrichment of SpiNet's base with the MeSH ontology, whose terms, concepts, descriptors and semantic types enable automatic semantic annotation. We used the SpiNet framework to annotate one hundred scientific papers; the F1-score, calculated between the classification of relevant sentences performed by the system and that performed by human physiotherapists, was 0.83.
  10. AMIA Annu Symp Proc. 2020 ;2020 813-822
      It is difficult to arrive at an efficient and widely acceptable set of common data elements (CDEs). Trial outcomes, as defined in a clinical trial registry, offer a large set of elements to analyze. However, the full set of clinical trial outcomes is an overwhelming amount of information. One way to reduce this amount of data to a usable volume is to use only a subset of trials. Our method considers only trials that support drug approval by the Food and Drug Administration (FDA), known as pivotal trials. We identified a set of pivotal trials from FDA drug approval documents and used primary outcomes data for these trials to identify a set of important CDEs. We identified 76 CDEs out of a set of 172 data elements from 192 pivotal trials for 100 drugs. This set of CDEs, grouped by medical condition, can be considered as containing the most significant data elements.
  11. JMIR Med Inform. 2021 May 06. 9(5): e28413
       BACKGROUND: Improving the understandability of health information can significantly increase the cost-effectiveness and efficiency of health education programs for vulnerable populations. There is a pressing need to develop clinically informed computerized tools to enable rapid, reliable assessment of the linguistic understandability of specialized health and medical education resources. This paper fills a critical gap in current patient-oriented health resource development, which requires reliable and accurate evaluation instruments to increase the efficiency and cost-effectiveness of health education resource evaluation.
    OBJECTIVE: We aimed to translate internationally endorsed clinical guidelines to machine learning algorithms to facilitate the evaluation of the understandability of health resources for international students at Australian universities.
    METHODS: Based on international patient health resource assessment guidelines, we developed machine learning algorithms to predict the linguistic understandability of health texts for Australian college students (aged 25-30 years) from non-English speaking backgrounds. We compared extreme gradient boosting, random forest, neural networks, and C5.0 decision tree for automated health information understandability evaluation. The 5 machine learning models achieved statistically better results compared to the baseline logistic regression model. We also evaluated the impact of each linguistic feature on the performance of each of the 5 models.
    RESULTS: We found that information evidentness, relevance to educational purposes, and logical sequence were consistently more important than numeracy skills and medical knowledge when assessing the linguistic understandability of health education resources for international tertiary students with adequate English skills (International English Language Testing System mean score 6.5) and high health literacy (mean 16.5 in the Short Assessment of Health Literacy-English test). Our results challenge the traditional views that lack of medical knowledge and numerical skills constituted the barriers to the understanding of health educational materials.
    CONCLUSIONS: Machine learning algorithms were developed to predict health information understandability for international college students aged 25-30 years. Thirteen natural language features and 5 evaluation dimensions were identified and compared in terms of their impact on the performance of the models. Health information understandability varies according to the demographic profiles of the target readers, and for international tertiary students, improving health information evidentness, relevance, and logic is critical.
    Keywords:  PEMAT; health education; machine learning; patient-oriented; understandability evaluation
    DOI:  https://doi.org/10.2196/28413
  12. Cleft Palate Craniofac J. 2021 May 07. 10556656211013177
       OBJECTIVE: It is important for health care education materials to be easily understood by caretakers of children requiring craniofacial surgery. This study aimed to analyze the readability of Google search results as they pertain to "Cleft Palate Surgery" and "Palatoplasty." Additionally, the study included a search from several locations globally to identify possible geographic differences.
    DESIGN: Google searches of the terms "Cleft Palate Surgery" and "Palatoplasty" were performed. Additionally, searches of only "Cleft Palate Surgery" were run from several internet protocol addresses globally.
    MAIN OUTCOME MEASURES: Flesch-Kincaid Grade Level and Readability Ease, Gunning Fog Index, Simple Measure of Gobbledygook (SMOG) index, and Coleman-Liau Index.
    RESULTS: Search results for "Cleft Palate Surgery" were easier to read and comprehend compared to search results for "Palatoplasty." Mean Flesch-Kincaid Grade Level scores were 7.0 and 10.11, respectively (P = .0018). Mean Flesch-Kincaid Reading Ease scores were 61.29 and 40.71, respectively (P = .0003). Mean Gunning Fog Index scores were 8.370 and 10.34, respectively (P = .0458). Mean SMOG Index scores were 6.84 and 8.47, respectively (P = .0260). Mean Coleman-Liau Index scores were 12.95 and 15.33, respectively (P = .0281). No significant differences were found in any of the readability measures based on global location.
    CONCLUSIONS: Although some improvement can be made, craniofacial surgeons can be confident in the online information pertaining to cleft palate repair, regardless of where the search is performed from. The average readability of the top search results for "Cleft Palate Surgery" is around the seventh-grade reading level (US educational system) and compares favorably to other health care readability analyses.
    Keywords:  cleft palate; cleft repair; online information; palatoplasty; patient information; readability
    DOI:  https://doi.org/10.1177/10556656211013177
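The two Flesch measures reported in the abstract above are simple functions of word, sentence and syllable counts. The sketch below uses the standard published formulas; the counts are invented for illustration, and the study's own calculator may count syllables and sentences differently, so its scores need not match this code exactly.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Invented counts for a short passage (illustration only).
words, sentences, syllables = 100, 5, 150

ease = flesch_reading_ease(words, sentences, syllables)
grade = flesch_kincaid_grade(words, sentences, syllables)
print(f"Reading Ease = {ease:.2f}, Grade Level = {grade:.2f}")
```

Both formulas use the same two ratios, words per sentence and syllables per word, which is why a lower grade level generally goes with a higher reading-ease score, as in the results reported above.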
  13. J Am Acad Dermatol. 2021 May 02. pii: S0190-9622(21)00935-X. [Epub ahead of print]
      
    Keywords:  Readability; hives; online health resources; patient education; quality; timeliness; urticaria
    DOI:  https://doi.org/10.1016/j.jaad.2021.04.089
  14. BMC Med Inform Decis Mak. 2021 May 05. 21(1): 149
       PURPOSE: Transjugular intrahepatic portosystemic shunt (TIPS) procedure is an established procedure carried out by interventional radiologists to achieve portal decompression and to manage the complications of portal hypertension. The aim of this study was to evaluate the quality and readability of information available online for TIPS procedure.
    METHODS: Websites were identified using the search terms "TIPS procedure", "TIPSS procedure", "transjugular intrahepatic portosystemic shunt procedure", with the first 25 pages from the three most popular search engines (Google, Bing and Yahoo) being selected for evaluation with a total of 225. Each Website was grouped by authorship into one of five categories: (1) Physician, (2) Academic, (3) For-profit, (4) Non-profit (including government and public health), or (5) Other (discussion/social media). Readability of each Website was assessed using the Flesch-Reading Ease score, Flesch-Kincaid grade level, Gunning-Fog Index, Coleman-Liau and SMOG index. Quality was calculated using the DISCERN instrument, the Journal of the American Medical Association (JAMA) benchmark criteria and the presence of Health on the Net (HON) code certification.
    RESULTS: After disregarding duplicate and non-accessible Websites a total of 81 were included. The mean DISCERN score assessing the quality of information provided by Websites was "good" (59.3 ± 10.2) with adherence to the JAMA Benchmark being 54.3%. Websites with HON-code certification were statistically significantly higher in terms of DISCERN (p = 0.034) and JAMA scores (p = 0.003) compared to HON-code negative sites. The readability scores of Websites ranged from 10 to 12th grade across calculators. Thirty-two out of the 81 Websites were targeted towards patients (39.5%), 46 towards medical professionals (56.8%) and 3 were aimed at neither (3.7%). The medical professional aimed Websites were statistically significantly more difficult to read across all readability formulas (all p < 0.001).
    CONCLUSION: While quality of online information available to patients is "good", the average readability for information on the internet for TIPS is set far above the recommended 7th-grade level. Academic Websites were of the highest quality, yet most challenging for the general public to read. These findings call for the production of high-quality and comprehensible content around TIPS procedure, where physicians can reliably direct their patients for information.
    Keywords:  Consumer health information; DISCERN; JAMA; Online; Readability; TIPS procedure
    DOI:  https://doi.org/10.1186/s12911-021-01513-x
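Three of the readability measures named in the abstract above (Gunning-Fog, SMOG and Coleman-Liau) can be sketched from their standard published formulas. The counts below are invented for illustration. Online calculators differ in how they identify "complex" or polysyllabic words, which is an assumption this sketch leaves to the caller.

```python
import math

def gunning_fog(words, sentences, complex_words):
    """Gunning Fog Index from word, sentence and complex-word counts."""
    return 0.4 * (words / sentences + 100 * complex_words / words)

def smog_index(polysyllables, sentences):
    """SMOG grade from the count of words with three or more syllables."""
    return 1.0430 * math.sqrt(polysyllables * 30 / sentences) + 3.1291

def coleman_liau(letters, words, sentences):
    """Coleman-Liau Index, based on letters and sentences per 100 words."""
    letters_per_100 = letters / words * 100
    sentences_per_100 = sentences / words * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8

# Invented counts (illustration only).
fog = gunning_fog(words=100, sentences=5, complex_words=10)
smog = smog_index(polysyllables=30, sentences=30)
cli = coleman_liau(letters=450, words=100, sentences=5)
print(f"Gunning Fog = {fog:.2f}, SMOG = {smog:.2f}, Coleman-Liau = {cli:.2f}")
```

Because each formula weights sentence length and word complexity differently, the same page can receive different grade levels from different calculators, consistent with the range of grades the abstract reports.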
  15. Health Commun. 2021 May 03. 1-9
      The goal of this study was to investigate the effects of sources of information on COVID-19 risk perceptions. Using data from a representative sample of the Portuguese population (N = 1,411) collected early in the pandemic, we find that while media sources were more frequently used, scientific sources played a more important role in perceived personal and societal-level risks: higher trust in scientific sources was associated with increased risk perceptions (i.e., amplified perceived risk), whereas trust in social media was associated with dismissing personal threat (i.e., attenuated perceived risk). These findings suggest that people's relations with science were determining factors in risk perceptions, and dimensions that measure these deserve further investigation.
    DOI:  https://doi.org/10.1080/10410236.2021.1914915
  16. Z Gesundh Wiss. 2021 Apr 26. 1-15
       Background: Extensive COVID-19 information can generate information overload and confusion. Denmark and Sweden adopted different COVID-19 management strategies.
    Aim: This study aimed to compare search strategies, perceptions and effects of COVID-19 information, in general and specifically in social media, in residents in Denmark and Sweden.
    Subject and methods: Quantitative data from a sample of respondents (n = 616) from Denmark and Sweden on an international web-based survey was analysed using descriptive and analytical statistics.
    Results: The results showed similarities between the countries regarding preferred and trusted information sources, use of (social) media, and psychosocial and behavioural effects of such information. Traditional media and social media were frequently used for COVID-19 information. Health authorities and researchers in particular were trusted sources, representing the dominant medico-political discourse. There were no differences in negative effect and social behaviour. Residents in Denmark experienced significantly more positive effects than residents in Sweden.
    Conclusion: In summary, the study showed similarities and small differences among residents in both countries related to usage patterns, perceptions and effects of COVID-19 information from (social) media, despite diverging strategies.
    Keywords:  COVID-19 information; Denmark; Psychosocial effects; Social media; Survey; Sweden
    DOI:  https://doi.org/10.1007/s10389-021-01539-5
  17. Transl Behav Med. 2021 May 05. pii: ibab038. [Epub ahead of print]
      Sleep problems are prevalent in early childhood, with the majority of caregivers desiring to change something about their child's sleep. Quality-assured education and resources related to infant and toddler sleep are needed. This article describes the development and dissemination of a global consumer health information website (http://www.babysleep.com) by the Pediatric Sleep Council to provide publicly accessible evidence-based information and resources for caregivers and practitioners. The website includes sleep health-related information and resources. Three phases of promotion and dissemination of the site were implemented: the launch, a social media strategy, and search engine optimization. Analysis of dissemination indicates exponential growth of the site since its launch. With access across the globe, the site has developed from its inception into a widely used resource, with over 800,000 users from around the world (99% of countries).
    Keywords:  Education; Infant; Parents; Sleep; Toddler; Website
    DOI:  https://doi.org/10.1093/tbm/ibab038
  18. JMIR Cancer. 2021 May 07. 7(2): e25357
       BACKGROUND: Thousands of web searches are performed related to transarterial chemoembolization (TACE), given its palliative role in the treatment of liver cancer.
    OBJECTIVE: This study aims to assess the reliability, quality, completeness, readability, understandability, and actionability of websites that provide information on TACE for patients.
    METHODS: The five most popular keywords pertaining to TACE were searched on Google, Yahoo, and Bing. General website characteristics and the presence of Health On the Net Foundation code certification were documented. Website assessment was performed using the following scores: DISCERN, Journal of the American Medical Association, Flesch-Kincaid Grade Level, Flesch Reading Ease Score, and the Patient Education Materials Assessment Tool. A novel TACE content score was generated to evaluate website completeness.
    RESULTS: The search yielded 3750 websites. In total, 81 website entities belonging to 78 website domains met the inclusion criteria. A medical disclaimer was not provided on 28% (22/78) of website domains. Health On the Net code certification was present on 12% (9/78) of website domains. Authorship was absent on 88% (71/81) of websites, and sources were absent on 83% (67/81) of websites. The date of publication or of the last update was not listed on 58% (47/81) of websites. The median DISCERN score was 47.0 (IQR 40.5-54.0). The median TACE content score was 35 (IQR 27-43). The median readability grade level was in the 11th grade. Overall, 61% (49/81) and 16% (13/81) of websites were deemed understandable and actionable, respectively. Not-for-profit websites fared significantly better on the Journal of the American Medical Association, DISCERN, and TACE content scores.
    CONCLUSIONS: The content referring to TACE that is currently available on the web is unreliable, incomplete, difficult to read, understandable but not actionable, and characterized by low overall quality. Websites need to revise their content to optimally educate consumers and support shared decision-making.
    TRIAL REGISTRATION: PROSPERO International Prospective Register of Systematic Reviews CRD42020202747; https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42020202747.
    Keywords:  hepatocellular carcinoma; internet; interventional oncology; interventional radiology; liver cancer; patient education; systematic review; transarterial chemoembolization
    DOI:  https://doi.org/10.2196/25357
  19. J Pediatr Oncol Nurs. 2021 May 07. 10434542211011045
       Background: Health literacy may influence the transition from pediatric care to adult care in adolescents with sickle cell disease (SCD). It is postulated that one influencing factor of health literacy in adolescents with SCD is health-seeking behavior. The purpose of this study was twofold: (1) to explore health-seeking behaviors of adolescents with SCD and (2) to determine if there are significant differences in health literacy levels of adolescents with SCD based upon health-seeking behaviors.
    Methods: This was a cross-sectional, descriptive study evaluating health-seeking behaviors and health literacy in 110 Black and non-Hispanic adolescents with SCD. Convenience sampling was utilized for recruitment. The inclusion criteria were a diagnosis of one of the four primary genotypes of SCD and age of 10-19 years. Health literacy was evaluated using the Newest Vital Sign (NVS). Frequencies and percentages were calculated for all variables. Independent Samples t-tests were conducted to evaluate differences in health literacy scores based upon differing health-seeking behaviors.
    Results: The mean age of participants was 14.8 years (SD = 2.2). The mean NVS score was 2.7 (SD = 1.6). The two most common responses to "where do you go FIRST for health information?" were the Internet (29.6%; n = 40) and health care providers (27.4%; n = 37). There was no statistical difference in NVS scores between adolescents using the Internet versus health care providers as their first source of health information (t[75] = -.12; p = .22).
    Discussion: Knowledge of health-seeking behaviors and health literacy in adolescents with SCD gives insight into the design and evaluation of future interventions to improve health and health literacy in this population.
    Keywords:  adolescents; adolescents and young adults (AYA); health literacy; sickle cell disease
    DOI:  https://doi.org/10.1177/10434542211011045
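The independent samples t-test mentioned in the abstract above compares the means of two groups. The sketch below computes the pooled-variance (Student's) t statistic and its degrees of freedom; whether the authors used the pooled or an unequal-variance form is an assumption here, and the scores are invented for illustration. With group sizes of 40 and 37, the pooled form gives 40 + 37 - 2 = 75 degrees of freedom, which is consistent with the reported t[75].

```python
from math import sqrt
from statistics import mean, variance

def students_t(sample_a, sample_b):
    """Return the pooled-variance t statistic and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df

# Invented health literacy scores for two small groups (not data from the study).
internet_first = [2, 3, 4, 1, 3]
provider_first = [3, 2, 5, 4, 2]

t_stat, dof = students_t(internet_first, provider_first)
print(f"t[{dof}] = {t_stat:.3f}")
```

Turning the statistic into a p-value requires the t distribution with the given degrees of freedom, which this sketch leaves to a statistics package.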