bims-librar Biomed News
on Biomedical librarianship
Issue of 2020–09–20
sixteen papers selected by
Thomas Krichel, Open Library Society



  1. J Am Med Inform Assoc. 2020 Sep 17. pii: ocaa163. [Epub ahead of print]
       OBJECTIVE: Randomized controlled trials (RCTs) are the gold standard method for evaluating whether a treatment works in health care but can be difficult to find and make use of. We describe the development and evaluation of a system to automatically find and categorize all new RCT reports.
    MATERIALS AND METHODS: Trialstreamer continuously monitors PubMed and the World Health Organization International Clinical Trials Registry Platform, looking for new RCTs in humans using a validated classifier. We combine machine learning and rule-based methods to extract information from the RCT abstracts, including free-text descriptions of trial PICO (populations, interventions/comparators, and outcomes) elements and map these snippets to normalized MeSH (Medical Subject Headings) vocabulary terms. We additionally identify sample sizes, predict the risk of bias, and extract text conveying key findings. We store all extracted data in a database, which we make freely available for download, and via a search portal, which allows users to enter structured clinical queries. Results are ranked automatically to prioritize larger and higher-quality studies.
    RESULTS: As of early June 2020, we have indexed 673 191 publications of RCTs, of which 22 363 were published in the first 5 months of 2020 (142 per day). We additionally include 304 111 trial registrations from the International Clinical Trials Registry Platform. The median trial sample size was 66.
    CONCLUSIONS: We present an automated system for finding and categorizing RCTs. This yields a novel resource: a database of structured information automatically extracted for all published RCTs in humans. We make daily updates of this database available on our website (https://trialstreamer.robotreviewer.net).
    Keywords:  automatic database curation; evidence based medicine; randomized controlled trials; research synthesis
    DOI:  https://doi.org/10.1093/jamia/ocaa163
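The ranking step in item 1 (larger and higher-quality studies first) can be illustrated with a minimal sketch. The record fields (`sample_size`, `low_risk_of_bias`) and the sort key are assumptions made for illustration only; the abstract does not describe Trialstreamer's actual ranking function.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    # Hypothetical record; field names are not taken from Trialstreamer.
    pmid: str
    sample_size: int
    low_risk_of_bias: bool


def rank_trials(trials):
    """Order trials so low-risk-of-bias studies come first, larger ones first within each group."""
    return sorted(trials, key=lambda t: (not t.low_risk_of_bias, -t.sample_size))


if __name__ == "__main__":
    # Made-up demo records, not real trials.
    demo = [Trial("A", 66, False), Trial("B", 40, True), Trial("C", 500, True)]
    print([t.pmid for t in rank_trials(demo)])
```

A tuple sort key keeps the two criteria separate, so a different quality signal could be swapped in without touching the size ordering.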
  2. J R Coll Physicians Edinb. 2020 Sep;50(3): 316-321
       BACKGROUND: A well-written manuscript published in a reputable journal is the deserved end-point of good research. It is important for postgraduates to be trained in scientific writing for their academic progression as well as the advancement of science.
    METHODS: A day-long workshop on scientific writing and publication was conducted at Raipur, India in February 2020. The medical postgraduate (UK equivalent: Core Medical Trainee) participants were engaged with lectures, discussions and a practical session requiring critical appraisal of a manuscript. The lectures also discussed publication ethics and the perils of falling prey to predatory journals. Pre and post-workshop surveys were given to the participants to assess the impact of the workshop on the baseline knowledge of scientific writing and publishing.
    RESULTS: Out of 69 participants, there were 67 (response rate 97%) and 41 (response rate 59%) respondents to the pre and post-workshop surveys respectively. The former identified a lack of baseline knowledge ranging from 6% for determining the components of the individual sections of the manuscript such as Introduction or Methods, 40% for the use of acronyms, and 55% for knowledge of different referencing styles, to 61% for knowledge of indexing agencies. The post-workshop survey revealed improvement in participants' knowledge of the contents of various sections of the manuscript and their knowledge about referencing styles and indexing agencies. In the post-workshop survey, 20% of respondents said that they would be open to engaging with predatory journals, which underscored the need to educate them continuously regarding the demerits of such practice. Participants expressed the need for longer workshops, preferably spread over two days, with discussion on research methodology and statistical analysis, and more 'hands-on' sessions.
    CONCLUSION: This survey underscores the need for structured training in scientific writing. Its inclusion in the medical postgraduate curriculum appears desirable.
    Keywords:  indexed journals; journal selection; manuscript writing; paper writing; predatory journals; publishing
    DOI:  https://doi.org/10.4997/JRCPE.2020.323
  3. Methods. 2020 Sep 15. pii: S1046-2023(20)30197-3. [Epub ahead of print]
      Analyzing disease-disease relationships plays an important role in understanding disease mechanisms and finding alternative uses for a drug. A disease is usually the result of an abnormal state of multiple molecular processes. Since biological networks can model the interplay of multiple molecular processes, network-based methods have recently been proposed to uncover disease-disease relationships. Given a disease and a network, the disease can be represented as a subnetwork, named the disease subnetwork, constructed from the disease genes present in the given network. Because it is difficult to learn the feature representation of disease subnetworks, most existing methods are unsupervised and do not use labeled information. To fill this gap, we propose a novel method named SubNet2vec to learn the feature vectors of diseases from their corresponding subnetworks in the biological network. By utilizing the feature representation of disease subnetworks, we can analyze disease-disease relationships in a supervised fashion. The evaluation results show that the proposed framework outperforms some state-of-the-art approaches by a large margin on disease-disease/disease-drug association prediction. The source code and data are available at https://github.com/MedicineBiology-AI/SubNet2vec.git.
    Keywords:  Disease associations analysis; Protein-protein interaction network; Subnetwork representation learning
    DOI:  https://doi.org/10.1016/j.ymeth.2020.09.002
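The disease subnetwork in item 3 can be sketched as follows, assuming it is the subgraph induced by the disease genes (edges whose two endpoints are both disease genes). This covers only the subnetwork construction, not the SubNet2vec embedding itself, and the gene names are made up.

```python
def disease_subnetwork(edges, disease_genes):
    """Return the edges of the network whose two endpoints are both disease genes."""
    genes = set(disease_genes)
    return [(u, v) for u, v in edges if u in genes and v in genes]


if __name__ == "__main__":
    # Toy interaction network with placeholder gene names.
    network = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G1", "G4")]
    print(disease_subnetwork(network, {"G1", "G2", "G4"}))
```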
  4. J Am Med Inform Assoc. 2020 Sep 17. pii: ocaa128. [Epub ahead of print]
       OBJECTIVE: We sought to assess the need for additional coverage of dietary supplements (DS) in the Unified Medical Language System (UMLS) by investigating (1) the overlap between the integrated DIetary Supplements Knowledge base (iDISK) DS ingredient terminology and the UMLS and (2) the coverage of iDISK and the UMLS over DS mentions in the biomedical literature.
    MATERIALS AND METHODS: We estimated the overlap between iDISK and the UMLS by mapping iDISK to the UMLS using exact and normalized strings. The coverage of iDISK and the UMLS over DS mentions in the biomedical literature was evaluated via a DS named-entity recognition (NER) task within PubMed abstracts.
    RESULTS: The coverage analysis revealed that only 30% of iDISK terms can be matched to the UMLS, although these cover over 99% of iDISK concepts. A manual review revealed that a majority of the unmatched terms represented new synonyms, rather than lexical variants. For NER, iDISK nearly doubles the precision and achieves a higher F1 score than the UMLS, while maintaining a competitive recall.
    DISCUSSION: While iDISK has significant concept overlap with the UMLS, it contains many novel synonyms. Furthermore, almost 3000 of these overlapping UMLS concepts are missing a DS designation, which could be provided by iDISK. The NER experiments show that the specialization of iDISK is useful for identifying DS mentions.
    CONCLUSIONS: Our results show that the DS representation in the UMLS could be enriched by adding DS designations to many concepts and by adding new synonyms.
    Keywords:  dietary supplements; named entity recognition, natural language processing; terminology; unified medical language system
    DOI:  https://doi.org/10.1093/jamia/ocaa128
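The two-stage term mapping (exact, then normalized strings) and the NER metrics in item 4 can be sketched as below. The `normalize` function is a simplified stand-in (lowercasing and punctuation stripping) and is an assumption; the abstract does not say which normalization the authors applied. The term lists are illustrative only.

```python
import re


def normalize(term):
    """Lowercase, replace runs of non-alphanumerics with one space, and trim."""
    return re.sub(r"[^a-z0-9]+", " ", term.lower()).strip()


def match_terms(source_terms, target_terms):
    """Label each source term as 'exact', 'normalized', or 'unmatched'."""
    exact = set(target_terms)
    normalized = {normalize(t) for t in target_terms}
    result = {}
    for term in source_terms:
        if term in exact:
            result[term] = "exact"
        elif normalize(term) in normalized:
            result[term] = "normalized"
        else:
            result[term] = "unmatched"
    return result


def precision_recall_f1(tp, fp, fn):
    """Standard metrics from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative terms, not drawn from iDISK or the UMLS.
    print(match_terms(["Vitamin C", "vitamin-d", "Ginkgo"], ["Vitamin C", "Vitamin D"]))
    print(precision_recall_f1(tp=1, fp=1, fn=0))
```

Because F1 is the harmonic mean of precision and recall, a large precision gain can raise F1 even when recall is slightly lower, which is the pattern the abstract reports for iDISK.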
  5. J Am Med Inform Assoc. 2020 Sep 17. pii: ocaa176. [Epub ahead of print]
       OBJECTIVE: Patients that undergo medical transfer represent 1 patient population that remains infrequently studied due to challenges in aggregating data across multiple domains and sources that are necessary to capture the entire episode of patient care. To facilitate access to and secondary use of transport patient data, we developed the Transport Data Repository that combines data from 3 separate domains and many sources within our health system.
    METHODS: The repository is a relational database anchored by the Unified Medical Language System unique concept identifiers to integrate, map, and standardize the data into a common data model. Primary data domains included sending and receiving hospital encounters, medical transport record, and custom hospital transport log data. A 4-step mapping process was developed: 1) automatic source code match, 2) exact text match, 3) fuzzy matching, and 4) manual matching.
    RESULTS: 431 090 total mappings were generated in the Transport Data Repository, consisting of 69 010 unique concepts with 77% of the data being mapped automatically. Transport Source Data yielded significantly lower mapping results with only 8% of data entities automatically mapped and a significant amount (43%) remaining unmapped.
    DISCUSSION: The multistep mapping process resulted in a majority of data being automatically mapped. Poor matching of transport medical record data is due to the third-party vendor data being generated and stored in a nonstandardized format.
    CONCLUSION: The multistep mapping process developed and implemented is necessary to normalize electronic health data from multiple domains and sources into a common data model to support secondary use of data.
    Keywords:  data curation; data management; data warehousing, transportation of patients; electronic data processing
    DOI:  https://doi.org/10.1093/jamia/ocaa176
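The 4-step mapping process in item 5 can be sketched as a cascade that stops at the first step that succeeds. This is a minimal illustration, not the repository's implementation: the lookup tables, the case-insensitive text comparison, the `difflib` similarity measure, and the 0.85 cutoff are all assumptions, and the concept identifiers are placeholders rather than real UMLS identifiers.

```python
import difflib


def map_entry(source_code, text, code_index, text_index, cutoff=0.85):
    """Try each automatic step in order and return (concept_id, step_name)."""
    # Step 1: automatic source code match.
    if source_code in code_index:
        return code_index[source_code], "source_code"
    # Step 2: exact text match (case-insensitive here, which is an assumption).
    key = text.strip().lower()
    if key in text_index:
        return text_index[key], "exact_text"
    # Step 3: fuzzy match against the known text labels.
    close = difflib.get_close_matches(key, list(text_index), n=1, cutoff=cutoff)
    if close:
        return text_index[close[0]], "fuzzy"
    # Step 4: nothing matched automatically, so flag the entry for manual matching.
    return None, "manual"


if __name__ == "__main__":
    # Placeholder lookup tables; the identifiers are invented for the demo.
    code_index = {"I10": "CUI-HTN"}
    text_index = {"hypertension": "CUI-HTN", "asthma": "CUI-ASTHMA"}
    for code, text in [("I10", "n/a"), (None, " Asthma "), (None, "hypertensoin"), (None, "xyz")]:
        print(map_entry(code, text, code_index, text_index))
```

Ordering the steps from strictest to loosest keeps the cheaper, more reliable matches from being overridden by fuzzy ones, and returning the step name makes it easy to report what share of entries each step mapped.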
  6. JMIR Public Health Surveill. 2020 Sep 09.
       BACKGROUND: The COVID-19 pandemic has led to a heightened need to understand health information seeking behaviors in order to address disparities in knowledge and beliefs about the crisis.
    OBJECTIVE: This study assessed socio-demographic predictors of the use and trust of different COVID-19 information sources, and the association between information sources, knowledge and beliefs about the pandemic.
    METHODS: An online survey was conducted among U.S. adults in two rounds during March-April 2020 using social-media advertisement-based recruitment. Participants were asked about their use of eleven different COVID-19 information sources as well as their most trusted source of information. COVID-related knowledge and belief questions were selected using past empirical literature and salient concerns at the time of survey implementation.
    RESULTS: The sample consisted of 11,242 participants. Traditional media sources (TV, radio or podcasts, or newspapers) when combined were the largest sources of COVID-19 information (91.2%). Among those using mainstream media sources for COVID-19 information (n=7,811, 69.5%), popular outlets included CNN (24.0%), Fox News (19.3%), and other local or national networks (35.2%). The largest individual information source was government websites (87.6%), which was also the most trusted source of information (43.3%); odds of trusting government websites were lower among males (AOR: 0.58, 95% CI: 0.53-0.63) and those aged 40-59 and ≥60 years compared to those aged 18-38 years (AOR: 0.83, 95% CI: 0.74-0.92; AOR: 0.62, 95% CI: 0.54-0.71). Participants used an average of 6.1 sources (SD: 2.3). Participants who were male, aged 40-59 or ≥60, not working/unemployed or retired, or Republican were likely to use fewer sources, while those with children and with higher educational attainment were likely to use more sources. Participants in April were markedly less likely to use (AORuse 0.41, 95% CI: 0.35-0.46) and trust (AORtrust 0.51, 95% CI: 0.47-0.56) government sources. The association between information source and COVID-19 knowledge was mixed, while many COVID-19 beliefs were significantly predicted by information source; similar trends were observed with reliance on different types of mainstream media outlets.
    CONCLUSIONS: COVID-19 information source was significantly determined by participant socio-demographic characteristics and was also associated with both knowledge and beliefs about the pandemic. Study findings can help inform COVID-19 health communication campaigns and highlight the impact of using a variety of different and trusted information sources.
    DOI:  https://doi.org/10.2196/21071
  7. Am J Audiol. 2020 Sep 18. 29(3S): 623-630
      PURPOSE: Hyperacusis is a disorder characterized by reduced sound tolerance leading to ear pain, emotional distress, and reduced quality of life. Many people with hyperacusis turn to the Internet for information and support from online communities to discuss their condition. The purpose of this study was to assess the content and quality of hyperacusis information presented online.
    METHOD: The three most used Internet search engines were used to identify relevant websites using the single search term hyperacusis. Fifteen websites were selected for analysis. Details of the purpose, audience, and content of each website were extracted using a bespoke data extraction form. The quality of the information on each website was rated using the validated DISCERN questionnaire.
    RESULTS: There was a wide disparity in the quality and content of hyperacusis information across websites. The website Hyperacusis Focus achieved the highest overall DISCERN score. Hyperacusis Focus and U.K. National Health Service websites were the most comprehensive online resources for health care professionals and patients, respectively. Wikipedia was judged useful for both health care professionals and patients. In general, hyperacusis-related information was accurate. However, no single website provided a complete account of hyperacusis, and some were judged to be selective in the information they provided.
    CONCLUSIONS: The Internet provides an important source of information for those who have hyperacusis and those who care for them. Revisions to the websites reviewed here are needed for each to provide a complete account of hyperacusis.
    Supplemental Material: https://doi.org/10.23641/asha.12869717.
    DOI:  https://doi.org/10.1044/2020_AJA-19-00074
  8. Digit Health. 2020 Jan-Dec;6:6 2055207620948996
       Background: The internet is a major source of information, but the reliability of data obtained from the web remains an unresolved issue. Unreliable online information can matter, especially in decisions related to health problems. Uncertainty about the quality of online health data may have a negative impact on citizens' health-related choices.
    Objective: This work was a cross-sectional literature review of published papers on online health information. The two main research objectives were to analyze trends in the use of health web sites and to assess the quality and reliability of medical web sites.
    Methods: The literature search used four digital reference databases, namely PubMed, British Medical Journal, Biomed, and CINAHL. The search entries were "trustworthy of medical information online," "survey to evaluate medical information online," "medical information online," and "habits of web-based health information users". Only papers published in English were included. The Newcastle Ottawa Scale was used to assess the quality of the selected works.
    Results: The literature search using the above entries returned 212 studies. Twenty-four articles in line with the study objectives and user characteristics were selected. People more likely to use the internet to obtain health information were females, younger people, scholars, and employees. Most people who use the internet to obtain health information take the reliability of online health sites into account, and physician assistance could help people find safer health web sites.
    Conclusions: Limited health information literacy and/or web literacy can cause misunderstandings when evaluating medical data found on the web. An appropriate education plan and evaluation tools could enhance user skills and lead to a more cautious analysis of health information found on the web.
    Keywords:  Health web sites; health literacy; medical information; quality of internet sites
    DOI:  https://doi.org/10.1177/2055207620948996
  9. Gigascience. 2020 Sep 14. pii: giaa097. [Epub ahead of print]9(9):
       BACKGROUND: Genome projects and multiomics experiments generate huge volumes of data that must be stored, mined, and transformed into useful knowledge. All this information is supposed to be accessible and, if possible, browsable afterwards. Computational biologists have been dealing with this scenario for more than a decade and have been implementing software and databases to meet this challenge. The GMOD's (Generic Model Organism Database) biological relational database schema, known as Chado, is one of the few successful open source initiatives; it is widely adopted and many software packages are able to connect to it.
    FINDINGS: We have been developing an open source software package named Machado, a genomics data integration framework implemented in Python, to enable research groups to both store and visualize genomics data. The framework relies on the Chado database schema and, therefore, should be intuitive for current developers to adopt or to run on top of already existing databases. It has several data-loading tools for genomics and transcriptomics data and also for annotation results from tools such as BLAST, InterproScan, OrthoMCL, and LSTrAP. There is an API to connect to JBrowse, and a web visualization tool is implemented using Django Views and Templates. The Haystack library integrated with the ElasticSearch engine was used to implement a Google-like search, i.e., a single auto-complete search box that provides fast results and filters.
    CONCLUSION: Machado aims to be a modern object-relational framework that uses the latest Python libraries to produce an effective open source resource for genomics research.
    Keywords:  Chado; Python; database; multiomics
    DOI:  https://doi.org/10.1093/gigascience/giaa097
  10. Sci Rep. 2020 Sep 15. 10(1): 15109
      A better understanding of the molecular mechanisms of kidney stone formation is required to improve the management of kidney stone disease and achieve better therapeutic outcomes. Recent kidney stone research has indicated critical roles for a group of proteins, namely 'stone modulators', in the promotion or inhibition of stone formation. Nevertheless, such information is currently dispersed and difficult to obtain. Herein, we present the kidney stone modulator database (StoneMod), a curated resource that collects the necessary information on such stone modulatory proteins, which can act as stone promoters or inhibitors, with experimental evidence from previously published studies. Currently, the StoneMod database contains 10, 16, 13, and 8 modulatory proteins that affect calcium oxalate crystallization, crystal growth, crystal aggregation, and crystal adhesion on renal tubular cells, respectively. Informative details of each modulatory protein and PubMed links to the published articles are provided. Additionally, hyperlinks to other protein/gene databases (e.g., UniProtKB, Swiss-Prot, Human Protein Atlas, PeptideAtlas, and Ensembl) are made available for users to obtain additional in-depth information on each protein. Moreover, this database provides a user-friendly web interface through which users can freely access the information and/or submit their own data for deposit or update. Database URL: https://www.stonemod.org.
    DOI:  https://doi.org/10.1038/s41598-020-71730-3
  11. J Med Internet Res. 2020 Sep 15. 22(9): e20632
       BACKGROUND: Oral contraceptives (OCs) are a unique chronic medication with which a memory slip may result in a threat that could change a person's life course. Subjective concerns of missed OC doses among women have been addressed infrequently. Anonymized queries to internet search engines provide unique access to concerns and information gaps faced by a large number of internet users.
    OBJECTIVE: We aimed to quantitate the frequency of queries by women seeking information in an internet search engine, after missing one or more doses of an OC; their further queries on emergency contraception, abortion, and miscarriage; and their rate of reporting a pregnancy timed to the cycle of missing an OC.
    METHODS: We extracted all English-language queries submitted to Bing in the United States during 2018, which mentioned a missed OC and subsequent queries of the same users on miscarriage, abortion, emergency contraceptives, and week of pregnancy.
    RESULTS: We identified 26,395 Bing users in the United States who queried about missing OC pills and the fraction that further queried about miscarriage, abortion, emergency contraceptive, and week of pregnancy. Users under the age of 30 years who asked about forgetting an OC dose were more likely to ask about abortion (1.5 times) and emergency contraception (1.7 times) (P<.001 for both), while users at ages of 30-34 years were more likely to query about pregnancy (2.1 times) and miscarriage (5.4 times) (P<.001 for both).
    CONCLUSIONS: Our data indicate that many women missing a dose of OC might not have received sufficient information from their health care providers or chose to obtain it online. Queries about abortion and miscarriage peaking in the subsequent days indicate a common worry of possible pregnancy. These results reinforce the importance of providing comprehensive written information on missed pills when prescribing an OC.
    Keywords:  abortion; birth control; miscarriage; search engines
    DOI:  https://doi.org/10.2196/20632
  12. JSES Int. 2020 Sep;4(3): 449-452
       Hypothesis and/or Background: When examining the access and content related to shoulder and elbow fellowship websites, only 64% of programs had individual websites in a query performed 5 years earlier. The purpose of this study was to re-evaluate content about individual programs listed on the American Shoulder and Elbow Surgeons (ASES) website and on individual program websites and compare the results to prior data.
    Methods: The ASES website was accessed to determine both the number of ASES-recognized shoulder and elbow fellowships and the number of direct links to fellowship program websites. A Google search was also performed to determine the ease of access to fellowship program websites. Each website was then evaluated for content in regard to their recruitment and educational program.
    Results: The ASES website includes contact information and a brief description for 29 programs with 40 reported positions. When trying to identify links to program websites, there were functioning links to 6 programs (21%) and absent/nonfunctioning links for the remaining 23 (79%). Through a Google search, there were functioning links to 22 (76%) and absent/nonfunctioning links for 7 (24%) programs. All 29 program websites had faculty listing and program contact info whereas 28 (97%) had a description of their program. In terms of educational content, 17 (59%) included description of operative cases and 18 (62%) had descriptions of rotations/curriculum.
    Discussion and/or Conclusion: Individual shoulder and elbow fellowship program websites provide varied content and accessibility. In the intervening 5 years, there has been minimal improvement in the accessibility of individual fellowship websites from the ASES website.
    Keywords:  Fellowship; Internet; elbow; medical education; shoulder; website
    DOI:  https://doi.org/10.1016/j.jseint.2020.04.011
  13. West J Nurs Res. 2020 Sep 18. 193945920959086
      Caregivers may receive information at a rate far higher than their individual abilities to process it. As a result, overloaded caregivers can cause less desirable health outcomes for their care recipients. This study sought to identify caregiver information overload in comparison to noncaregivers. Related factors such as caregiving contexts, health status, and personal health literacy were also compared between caregivers and noncaregivers. Using a nationally representative survey, the Health Information National Trends Survey, the differences between caregivers and noncaregivers regarding information overload were compared. A total of 2,918 noncaregivers and 484 caregivers were identified. More than two-thirds of the study sample demonstrated information overload regardless of caregiving status. Male, less educated, lower income, married, and employed caregivers were more likely to be overloaded with information. Caregivers with information overload reported poorer health and a greater information-seeking burden. Effective countermeasures against heavy information overload should be devised based on specific causes and their accompanying consequences.
    Keywords:  caregivers; health information national trends survey; information overload; personal health literacy
    DOI:  https://doi.org/10.1177/0193945920959086
  14. J Med Internet Res. 2020 Sep 16. 22(9): e20910
       BACKGROUND: Patients attempt to make appropriate decisions based on their own knowledge when choosing a doctor. In this process, the first question usually faced is that of how to obtain useful and relevant information. This study investigated the types of information sources that are used widely by patients in choosing a doctor and identified ways in which the preferred sources differ in various situations.
    OBJECTIVE: This study aims to address the following questions: (1) What is the proportion in which each of the various information sources is used? (2) How does the information source preferred by patients in choosing a doctor change when there is a difference in the difficulty of medical decision making, in the level of the hospital, or in a rural versus urban situation? (3) How do information sources used by patients differ when they choose doctors with different specialties?
    METHODS: This study overcomes a major limitation in the use of the survey technique by employing data from the Good Doctor website, which is now China's leading online health care community, data which are objective and can be obtained relatively easily and frequently. Multinomial logistic regression models were applied to examine whether the proportion of use of these information sources changes in different situations. We then used visual analysis to explore the question of which type of information source patients prefer to use when they seek medical assistance from doctors with different specialties.
    RESULTS: The 3 main information sources were online reviews (OR), family and friend recommendations (FR), and doctor recommendations (DR), with proportions of use of 32.93% (559,345/1,698,666), 23.68% (402,322/1,698,666), and 17.48% (296,912/1,698,666), respectively. Difficulty in medical decision making, the hospital level, and rural-urban differences were significantly associated with patients' preferred information sources for choosing doctors. Further, the sources of information that patients prefer to use were found to vary when they looked for doctors with different medical specialties.
    CONCLUSIONS: Patients are less likely to use online reviews when medical decisions are more difficult or when the provider is not a tertiary hospital, the former situation leading to a greater use of doctor recommendations and the latter to a greater use of family and friend recommendations. In addition, patients in large cities are more likely to use information from online reviews than family and friend recommendations. Among different medical specialties, for those in which personal privacy is a concern, online reviews are the most common source. For those related to children, patients are more likely to refer to family and friend recommendations, and for those related to surgery, they value doctor recommendations more highly. Our results can not only aid government efforts to further promote the dissemination of health care information but may also help health care industry managers develop better marketing strategies.
    Keywords:  decision making; doctor; health information; information source; online health care community; online reviews
    DOI:  https://doi.org/10.2196/20910
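The proportions of use reported in item 14 can be reproduced from the counts given in the abstract. The short check below uses only those counts; the function name `percent` is introduced here for illustration.

```python
TOTAL = 1_698_666  # denominator reported in the abstract
COUNTS = {
    "online reviews (OR)": 559_345,
    "family and friend recommendations (FR)": 402_322,
    "doctor recommendations (DR)": 296_912,
}


def percent(count, total):
    """Share of `total` as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


if __name__ == "__main__":
    for source, count in COUNTS.items():
        print(f"{source}: {percent(count, TOTAL)}%")
```

The three rounded shares agree with the 32.93%, 23.68%, and 17.48% stated in the abstract.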