bims-librar Biomed News
on Biomedical librarianship
Issue of 2023‒11‒26
24 papers selected by
Thomas Krichel, Open Library Society



  1. J Fungi (Basel). 2023 Oct 30. pii: 1061. [Epub ahead of print] 9(11):
      Libraries contain a large amount of organic material, frequently stored with inadequate climate control; thus, mold growth represents a considerable threat to library buildings and their contents. In this essay, we review published papers that have isolated microscopic fungi from library books, shelving, walls, and other surfaces, as well as from air samples within library buildings. Our literature search found 54 published studies about mold in libraries, 53 of which identified fungi to genus and/or species. In 28 of the 53 studies, Aspergillus was the single most common genus isolated from libraries. Most of these studies used traditional culture and microscopic methods for identifying the fungi. Mold damage to books and archival holdings causes biodeterioration of valuable educational and cultural resources. Exposure to molds may also be correlated with negative health effects in both patrons and librarians, so there are legitimate concerns about the dangers of contact with high levels of fungal contamination. Microbiologists are frequently called upon to help librarians after flooding and other events that bring water into library settings. This review can help guide microbiologists to choose appropriate protocols for the isolation and identification of mold in libraries and be a resource for librarians who are not usually trained in building science to manage the threat molds can pose to library holdings.
    Keywords:  Aspergillus; biodeterioration; filamentous fungi; indoor microbiomes; library; mold; mold isolation; mould
    DOI:  https://doi.org/10.3390/jof9111061
  2. Health Info Libr J. 2023 Dec;40(4): 341-342
      In this second special collection of COVID-19-related manuscripts, our focus moves from health information within academia to health librarianship in the wider context. Although COVID-19 manuscripts may still occasionally appear in the Health Information and Libraries Journal, the World Health Organisation's declaration earlier this year of an end to the global health emergency marks an intentional editorial shift to adopting a broader perspective in publishing this type of work, a focus on public health information challenges and emergency preparedness, and a return to publishing a more familiar range of health library and information contexts and practice.
    Keywords:  access to information; general practitioners (GPs); internet access; pandemic; public health
    DOI:  https://doi.org/10.1111/hir.12513
  3. Nucleic Acids Res. 2023 Nov 22. pii: gkad1085. [Epub ahead of print]
      Europe PMC (https://europepmc.org/) is an open access database of life science journal articles and preprints, which contains over 42 million abstracts and over 9 million full text articles accessible via the website, APIs and bulk download. This publication outlines new developments to the Europe PMC platform since the last database update in 2020 (1) and focuses on five main areas. (i) Improving discoverability, reproducibility and trust in preprints by indexing new preprint content, enriching preprint metadata and identifying withdrawn and removed preprints. (ii) Enhancing support for text and data mining by expanding the types of annotations provided and developing the Europe PMC Annotations Corpus, which can be used to train machine learning models to increase their accuracy and precision. (iii) Developing the Article Status Monitor tool and email alerts, to notify users about new articles and updates to existing records. (iv) Positioning Europe PMC as an open scholarly infrastructure through increasing the portion of open source core software, improving sustainability and accessibility of the service.
    DOI:  https://doi.org/10.1093/nar/gkad1085
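For readers who want to try the APIs this abstract mentions, a minimal Python sketch that only assembles a Europe PMC REST search URL (the query string and parameter values are illustrative; parameter names follow the public Europe PMC REST search endpoint):

```python
from urllib.parse import urlencode

BASE = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

def europepmc_search_url(query, page_size=25, result_type="lite"):
    # Assemble the query string only; no network request is made here.
    params = {
        "query": query,           # Europe PMC query syntax, e.g. SRC:PPR for preprints
        "format": "json",
        "pageSize": page_size,
        "resultType": result_type,
    }
    return BASE + "?" + urlencode(params)

# Illustrative query: preprints with "library" in the title.
url = europepmc_search_url('SRC:PPR AND TITLE:"library"')
print(url)
```

Fetching the resulting URL (with any HTTP client) returns a JSON page of results; the same endpoint also serves XML.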
  4. Nucleic Acids Res. 2023 Nov 22. pii: gkad1044. [Epub ahead of print]
      The National Center for Biotechnology Information (NCBI) provides online information resources for biology, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. NCBI provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for most of these databases. Resources receiving significant updates in the past year include PubMed, PMC, Bookshelf, SciENcv, the NIH Comparative Genomics Resource (CGR), NCBI Virus, SRA, RefSeq, foreign contamination screening tools, Taxonomy, iCn3D, ClinVar, GTR, MedGen, dbSNP, ALFA, ClinicalTrials.gov, Pathogen Detection, antimicrobial resistance resources, and PubChem. These resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov.
    DOI:  https://doi.org/10.1093/nar/gkad1044
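The E-utilities named in this abstract can likewise be called over plain HTTPS. A minimal sketch that builds an ESearch URL for PubMed (the search term is illustrative; heavier use should add an NCBI api_key parameter and respect the documented rate limits):

```python
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def esearch_url(db, term, retmax=20):
    # Build the ESearch URL only; callers fetch it themselves.
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?" + urlencode(params)

# Illustrative PubMed query using a MeSH term.
url = esearch_url("pubmed", "health literacy[MeSH Terms]")
print(url)
```

ESearch returns a list of matching UIDs; companion endpoints such as EFetch and ESummary retrieve the corresponding records.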
  5. BMJ Evid Based Med. 2023 Nov 21. pii: bmjebm-2023-112678. [Epub ahead of print]
      
    Keywords:  Health Services Research; Methods; Systematic Reviews as Topic
    DOI:  https://doi.org/10.1136/bmjebm-2023-112678
  6. Disabil Rehabil Assist Technol. 2023 Nov 20. 1-10
      PURPOSE: The rising number of apps requires careful consideration of how these apps are selected for students with extensive support needs in school-based settings. Current practices suggest that educational apps are being purchased without the use of an evaluation tool to determine their quality or effectiveness. A systematic literature review was conducted to identify educational app evaluation tools for students with extensive support needs.
    MATERIALS AND METHODS: A three-phase search process (electronic database search, journal hand-search, and ancestral search) was conducted using 14 keywords to maximize the number of articles. A two-step coding procedure was conducted to ensure articles met the four inclusion criteria. A 15-criterion checklist was used to evaluate the methodological rigor of accepted articles.
    RESULTS: Findings focused on the type of app evaluation tools and their specific evaluation dimensions. A total of 107 articles were identified with 13 articles meeting the inclusion criteria. Stage 1 evaluated the methodological rigor of the app evaluation tools (M = 6.15, range 0.5 - 14). Stage 2 categorized the articles based on the type of evaluation tools (rubric = 5, rating scale = 6, checklist = 2). Stage 3 identified five evaluation dimensions (background, design features, usability, individualization, and overall impression).
    CONCLUSIONS: There is a lack of empirically tested evaluation tools for communication and educational apps, making it difficult to recommend a valid app evaluation tool. Thus, barriers are likely to persist in the effective identification of apps for students with extensive support needs.
    Keywords:  Apps; app evaluation; autism; extensive support needs; intellectual disabilities; severe disabilities; systematic literature review
    DOI:  https://doi.org/10.1080/17483107.2023.2283053
  7. Ophthalmic Plast Reconstr Surg. 2023 Nov 16.
      PURPOSE: To assess the accuracy and readability of responses generated by the artificial intelligence model ChatGPT (version 4.0) to questions related to 10 essential domains of orbital and oculofacial disease.
    METHODS: A set of 100 questions related to the diagnosis, treatment, and interpretation of orbital and oculofacial diseases was posed to ChatGPT 4.0. Responses were evaluated by a panel of 7 experts based on appropriateness and accuracy, with performance scores measured on a 7-point Likert scale. Inter-rater reliability was determined via the intraclass correlation coefficient.
    RESULTS: The artificial intelligence model demonstrated accurate and consistent performance across all 10 domains of orbital and oculofacial disease, with an average appropriateness score of 5.3/6.0 ("mostly appropriate" to "completely appropriate"). Domains of cavernous sinus fistula, retrobulbar hemorrhage, and blepharospasm had the highest domain scores (average scores of 5.5 to 5.6), while the proptosis domain had the lowest (average score of 5.0/6.0). The intraclass correlation coefficient was 0.64 (95% CI: 0.52 to 0.74), reflecting moderate inter-rater reliability. The responses exhibited a high reading-level complexity, representing the comprehension levels of a college or graduate education.
    CONCLUSIONS: This study demonstrates the potential of ChatGPT 4.0 to provide accurate information in the field of ophthalmology, specifically orbital and oculofacial disease. However, challenges remain in ensuring accurate and comprehensive responses across all disease domains. Future improvements should focus on refining the model's correctness and eventually expanding the scope to visual data interpretation. Our results highlight the vast potential for artificial intelligence in educational and clinical ophthalmology contexts.
    DOI:  https://doi.org/10.1097/IOP.0000000000002552
  8. Int J Impot Res. 2023 Nov 20.
      Erectile dysfunction (ED) is a disorder that can cause distress and shame for men suffering from it. Men with ED will often turn to online support and chat groups to ask intimate questions about their health. ChatGPT is an artificial intelligence (AI)-based software that has been trained to engage in conversation with human input. We sought to assess the accuracy, readability, and reproducibility of ChatGPT's responses to frequently asked questions regarding the diagnosis, management, and care of patients with ED. Questions pertaining to ED were derived from clinic encounters with patients as well as online chat forums. These were entered into the free ChatGPT version 3.5 during the month of August 2023. Questions were asked on two separate days from unique accounts and computers to prevent the software from memorizing responses linked to a specific user. A total of 35 questions were asked. Outcomes measured were accuracy using grading from board certified urologists, readability with the Gunning Fog Index, and reproducibility by comparing responses between days. For epidemiology of disease, the percentage of responses that were graded as "comprehensive" or "correct but inadequate" was 100% across both days. There was fair reproducibility and median readability of 15.9 (IQR 2.5). For treatment and prevention, the percentage of responses that were graded as "comprehensive" or "correct but inadequate" was 78.9%. There was poor reproducibility of responses with a median readability of 14.5 (IQR 4.0). Risks of treatment and counseling both had 100% of questions graded as "comprehensive" or "correct but inadequate." The readability score for risks of treatment was median 13.9 (IQR 1.1) and for counseling median 13.8 (IQR 0.5), with good reproducibility for both question domains. ChatGPT provides accurate answers to common patient questions pertaining to ED, although its understanding of treatment options is incomplete and responses are at a reading level too advanced for the average patient.
    DOI:  https://doi.org/10.1038/s41443-023-00797-z
  9. Nurse Educ. 2023 Nov 22.
      BACKGROUND: Research on ChatGPT-generated nursing care planning texts is critical for enhancing nursing education through innovative and accessible learning methods and for improving their reliability and quality.
    PURPOSE: The aim of the study was to examine the quality, authenticity, and reliability of the nursing care planning texts produced using ChatGPT.
    METHODS: The study sample comprised 40 texts generated by ChatGPT for selected nursing diagnoses included in NANDA 2021-2023. The texts were evaluated using a descriptive criteria form and the DISCERN tool for evaluating health information.
    RESULTS: The DISCERN total average score of the texts was 45.93 ± 4.72. All texts had a moderate level of reliability, and 97.5% of them had an information-quality subscale score in the moderate range. Statistically significant correlations were found between the number of accessible references and both the reliability (r = 0.408) and quality subscale (r = 0.379) scores of the texts (P < .05).
    CONCLUSION: ChatGPT-generated texts exhibited moderate reliability, quality of nursing care information, and overall quality despite low similarity rates.
    DOI:  https://doi.org/10.1097/NNE.0000000000001566
  10. Laryngoscope. 2023 Nov 20.
      OBJECTIVE: With burgeoning popularity of artificial intelligence-based chatbots, oropharyngeal cancer patients now have access to a novel source of medical information. Because chatbot information is not reviewed by experts, we sought to evaluate an artificial intelligence-based chatbot's oropharyngeal cancer-related information for accuracy.
    METHODS: Fifteen oropharyngeal cancer-related questions were developed and input into ChatGPT version 3.5. Four physician-graders independently assessed accuracy, comprehensiveness, and similarity to a physician response using 5-point Likert scales. Responses graded lower than three were then critiqued by physician-graders. Critiques were analyzed using inductive thematic analysis. Readability of responses was assessed using Flesch Reading Ease (FRE) and Flesch-Kincaid Reading Grade Level (FKRGL) scales.
    RESULTS: Average accuracy, comprehensiveness, and similarity to a physician response scores were 3.88 (SD = 0.99), 3.80 (SD = 1.14), and 3.67 (SD = 1.08), respectively. Posttreatment-related questions were most accurate, comprehensive, and similar to a physician response, followed by treatment-related, then diagnosis-related questions. Posttreatment-related questions scored significantly higher than diagnosis-related questions in all three domains (p < 0.01). Two themes of the physician critiques were identified: suboptimal education value and potential to misinform patients. The mean FRE and FKRGL scores both indicated greater than an 11th grade readability level-higher than the 6th grade level recommended for patients.
    CONCLUSION: ChatGPT responses may not educate patients to an appropriate degree, could outright misinform them, and read at a more difficult grade level than is recommended for patient material. As oropharyngeal cancer patients represent a vulnerable population facing complex, life-altering diagnoses, and treatments, they should be cautious when consuming chatbot-generated medical information.
    LEVEL OF EVIDENCE: N/A. Laryngoscope, 2023.
    Keywords:  ChatGPT; artificial intelligence; communication; head and neck cancer
    DOI:  https://doi.org/10.1002/lary.31191
  11. World Neurosurg. 2023 Nov 22. pii: S1878-8750(23)01625-X. [Epub ahead of print]
      OBJECTIVE: This study aimed to assess the quality, readability, and comprehension of texts generated by ChatGPT in response to commonly asked questions about spinal cord injury (SCI).
    METHODS: The study utilized Google Trends to identify the most frequently searched keywords related to SCI. The identified keywords were sequentially inputted into ChatGPT, and the resulting responses were assessed for quality using the Ensuring Quality Information for Patients (EQIP) tool. The readability of the texts was analyzed using the Flesch-Kincaid Grade Level and the Flesch-Kincaid Reading Ease parameters.
    RESULTS: The mean EQIP score of the texts was determined to be 43.02 ± 6.37, the Flesch-Kincaid Reading Ease score to be 26.24 ± 13.81, and the Flesch-Kincaid Grade Level to be 14.84 ± 1.79. The analysis revealed significant concerns regarding the quality of texts generated by ChatGPT, indicating serious problems with readability and comprehension. The mean EQIP score was low, suggesting a need for improvement in the accuracy and reliability of the information provided. The Flesch-Kincaid Grade Level indicated high linguistic complexity, requiring a level of education equivalent to approximately 14 to 15 years of formal education for comprehension.
    CONCLUSION: The results of this study show heightened complexity in ChatGPT-generated SCI texts, surpassing optimal health communication readability. ChatGPT currently cannot substitute comprehensive medical consultations. Enhancing text quality could be attainable through dependence on credible sources, the establishment of a scientific board, and collaboration with expert teams. Addressing these concerns could improve text accessibility, empowering patients and facilitating informed decision-making in SCI.
    Keywords:  ChatGPT; Spinal cord injury; comprehension; quality assessment; readability
    DOI:  https://doi.org/10.1016/j.wneu.2023.11.062
  12. Eur J Obstet Gynecol Reprod Biol. 2023 Nov 19. pii: S0301-2115(23)00828-X. [Epub ahead of print] 292: 133-137
      OBJECTIVES: To review systematically the quality, readability and credibility of English-language webpages offering patient information on fetal growth restriction.
    STUDY DESIGN: A systematic review of patient information was undertaken on Google with location services and browser history disabled. Websites from the first page were included provided they gave at least 300 words of health information on fetal growth restriction aimed at patients. Validated assessments of readability, credibility and quality were undertaken. An accuracy assessment was performed based on international guidance. Characteristics were tabulated.
    RESULTS: Thirty-one websites, comprising 30 distinct texts, were included. No pages had a reading age of 11 years or less, none were credible, and only one was of high quality. The median accuracy rating was 9/24.
    CONCLUSION: Patients cannot rely on Google as a source of information on fetal growth restriction. As well as being difficult to read, information tends to be low quality, low accuracy and not credible. Healthcare professionals must consider how to enable access to high-quality patient information and give time for discussion of information patients have found: failure to do so may disenfranchise patients.
    Keywords:  Antenatal care; Fetal growth restriction; Fetal medicine; Patient information
    DOI:  https://doi.org/10.1016/j.ejogrb.2023.11.022
  13. Public Health. 2023 Nov 16. pii: S0033-3506(23)00388-8. [Epub ahead of print] 226: 1-7
      OBJECTIVES: The purpose of this study is to evaluate the readability and quality of Internet information related to vocal health, voice disorders and voice therapy.
    STUDY DESIGN: This is a cross-sectional study.
    METHODS: Eighty-two websites were included. Websites were then analyzed; their origin (clinic/hospital, non-profit, government), quality (Health On the Net [HON] certification and DISCERN scores) and readability (Ateşman readability formula and Bezirci-Yılmaz new readability formula) were assessed. Statistical analysis was used to examine differences between website origin and quality and readability scores and correlations between readability instruments.
    RESULTS: Of the 82 websites, 93% belonged to private clinics/hospitals, 6% to non-profit organisations and 1% to government bodies. None of the 82 websites had HON certification, and the mean score of the item measuring general quality in DISCERN was 1.83 on a five-point scale. The mean Ateşman readability formula value was 50.46 (±8.16), which is defined as 'moderately hard' on the readability scale. The mean Bezirci-Yılmaz new readability formula value was 13.85 (±3.48), corresponding to the 13th-14th grade level.
    CONCLUSIONS: The quality of Internet-based health information about the voice is generally inadequate, and the sites examined in this study may be limited due to high readability levels. This may be a problem in people with poor literacy skills. For this reason, it is very important for speech and language therapists and other health professionals to evaluate and monitor the quality and readability of Internet-based information.
    Keywords:  Health information; Internet-based; Readability; Vocal health
    DOI:  https://doi.org/10.1016/j.puhe.2023.10.020
  14. JMIR Med Educ. 2023 Nov 24. 9: e45372
      BACKGROUND: YouTube is considered one of the most popular sources of information among college students.
    OBJECTIVE: This study aimed to explore the use of YouTube as a pathology learning tool and its relationship with pathology scores among medical students at Jordanian public universities.
    METHODS: This cross-sectional, questionnaire-based study included second-year to sixth-year medical students from 6 schools of medicine in Jordan. The questionnaire was distributed among the students via social platforms over a period of 2 months extending from August 2022 to October 2022. The questionnaire included 6 attributes. The first section collected demographic data, and the second section investigated the general use of YouTube and recorded material. The remaining 4 sections targeted the participants who used YouTube to learn pathology, including their use of YouTube for pathology-related content.
    RESULTS: As of October 2022, 699 students were enrolled in the study. More than 60% (422/699, 60.4%) of the participants were women, and approximately 50% (354/699, 50.6%) were second-year students. The results showed that 96.5% (675/699) of medical students in Jordan were using YouTube in general and 89.1% (623/699) were using it as a source of general information. YouTube use was associated with good and very good scores among the users. In addition, 82.3% (575/699) of medical students in Jordan used YouTube as a learning tool for pathology in particular. These students achieved high scores, with 428 of 699 (61.2%) students scoring above 70%. Most participants (484/699, 69.2%) reported that lectures on YouTube were more interesting than classic teaching and the lectures could enhance the quality of learning (533/699, 76.3%). Studying via YouTube videos was associated with higher odds (odds ratio [OR] 3.86, 95% CI 1.33-11.18) and lower odds (OR 0.27, 95% CI 0.09-0.8) of achieving higher scores in the central nervous system and peripheral nervous system courses, respectively. Watching pathology lectures on YouTube was related to a better chance of attaining higher scores (OR 1.96, 95% CI 1.08-3.57). Surprisingly, spending more time watching pathology videos on YouTube while studying for examinations corresponded with lower performance, with an OR of 0.46 (95% CI 0.26-0.82).
    CONCLUSIONS: YouTube may play a role in enhancing pathology learning, and aiding in understanding, memorization, recalling information, and obtaining higher scores. Many medical students in Jordan have positive attitudes toward using YouTube as a supplementary pathology learning tool. Based on this, it is recommended that pathology instructors should explore the use of YouTube and other emerging educational tools as potential supplementary learning resources.
    Keywords:  YouTube; medical education; medical students; online resources; pathology; social media
    DOI:  https://doi.org/10.2196/45372
  15. Breast Cancer. 2023 Nov 23.
      BACKGROUND: The internet, especially YouTube, has become a prominent source of health information. However, the quality and accuracy of medical content on YouTube vary, posing concerns about misinformation. This study focuses on the availability of reliable information about hereditary breast cancer on YouTube, given its importance for decision-making among patients and families. The study examines the quality and accessibility of such content in Japanese, where limited research has been conducted.
    METHODS: A nonprofit organization called BC Tube was established in May 2020 to create informative videos about breast cancer. The study analyzed 85 YouTube videos selected using the Japanese keywords "hereditary breast cancer" and "HBOC", categorized into six groups based on the source of upload: BC Tube, hospitals/governments, individual physicians, public-interest organizations/companies, breast cancer survivors, and others. The videos were evaluated on various factors, including content length, view counts, likes, comments, and the presence of advertisements. The content was evaluated using the PEMAT and DISCERN quality criteria.
    RESULTS: BC Tube created high-quality videos with high scores on PEMAT understandability, significantly outperforming other sources. Videos from public-interest organizations/companies received the most views and likes, despite their lower quality. Videos from medical institutions and governments were of superior quality but attracted less attention.
    CONCLUSIONS: Our study emphasizes the importance of promoting accessible, easy-to-understand, and widely recognized medical information online. The popularity of videos does not always correspond to their quality, emphasizing the importance of quality evaluation. BC Tube provides a peer-reviewed platform to disseminate high-quality health information. We need to develop high-quality online health information and encourage the promotion of evidence-based information on YouTube.
    Keywords:  Hereditary breast cancer; Online health information; Patient and public involvement (PPI); Peer review; YouTube
    DOI:  https://doi.org/10.1007/s12282-023-01512-y
  16. Shoulder Elbow. 2023 Dec;15(6): 674-679
      Background: Ulnar collateral ligament reconstruction (UCLR) is commonly performed on adolescent athletes, who often turn to online sources such as YouTube for health information. The purpose of this study was to retrospectively review the accuracy, reliability, and quality of UCLR videos using validated scoring instruments.
    Methods: YouTube was queried for "Tommy John surgery," "UCL reconstruction," and "ulnar collateral ligament reconstruction." After categorization by physician, nonphysician/trainer, patient, or commercial source, videos were assessed for reliability and quality using the Journal of the American Medical Association (JAMA) benchmark criteria (0-4) and the DISCERN tool (16-80).
    Results: 104 videos were included in the final analysis. 74% of videos (77/104) were made by physicians. The mean JAMA and DISCERN scores for all videos were 3.1 ± 0.8 and 46.1 ± 8.5, respectively. The majority of videos were rated as "fair" based on DISCERN score (56/104, 53.8%). JAMA scores were significantly higher for physician videos compared to nonphysician videos (3.3 ± 0.8 vs 2.6 ± 0.7, p < 0.0001), but no such difference was found for DISCERN scores (46.3 ± 7.7 vs 45.3 ± 10.57, p = 0.43).
    Conclusion: Physicians should be cognizant of the quality and reliability of YouTube videos when instructing patients on information sources related to UCLR.
    Keywords:  Tommy John surgery; UCL; YouTube; ulnar collateral ligament; ulnar collateral ligament reconstruction
    DOI:  https://doi.org/10.1177/17585732221129590
  17. Health Policy. 2023 Nov 08. pii: S0168-8510(23)00227-0. [Epub ahead of print]138 104942
      We use recently released data from the Survey of Health, Ageing and Retirement in Europe (SHARE) to investigate the role of online health information seeking in Covid-19 vaccine hesitancy, defined as the reluctance or refusal to receive vaccinations despite the availability of vaccines. We adopt an instrumental variable strategy that exploits the computerization of workplaces that occurred in the last century to deal with endogeneity. We find that searching for health information strongly reduces vaccine hesitancy. Results also show that individuals whose social networks suffered more during the outbreak, in terms of hospitalisations and deaths, are less likely to be hesitant. Improving individuals' technological skills might have positive spill-over effects for public health.
    Keywords:  Covid-19; Internet; Vaccine hesitancy
    DOI:  https://doi.org/10.1016/j.healthpol.2023.104942
  18. Mol Cell Proteomics. 2023 Nov 20. pii: S1535-9476(23)00193-7. [Epub ahead of print] 100682
      Global phosphoproteomics experiments generate data on tens of thousands of phosphorylation sites. However, interpretation of phosphoproteomic findings is hampered by our limited knowledge of the functions, biological contexts, or precipitating enzymes of the phosphosites. This study aims to build a repository of phosphosites with associated evidence in biomedical abstracts by applying deep learning-based natural language processing (NLP) techniques. Our model for illuminating the dark phosphoproteome through PubMed mining (IDPpub) was generated by fine-tuning BioBERT using sentences containing protein substrates and phosphorylation site positions from 3000 abstracts. The IDPpub model was then used to extract phosphorylation sites from all abstracts in MEDLINE. The extracted proteins were normalized to gene symbols using NCBI gene query, and sites were mapped to human UniProt sequences using ProtMapper and to mouse UniProt sequences by direct match. Precision and recall were calculated using 150 curated abstracts, and utility was assessed by analyzing the CPTAC pan-cancer phosphoproteomics datasets and the PhosphoSitePlus database. Using 10-fold cross validation, pairs of correct substrates and phosphosite positions were extracted with an average precision of 0.93 and recall of 0.94. After entity normalization and site mapping to human reference sequences, an independent validation achieved a precision of 0.91 and recall of 0.77. Overall, the current version of the IDPpub repository contains 18,458 unique phosphorylation sites with evidence sentences from 58,227 abstracts for human sites and 5,918 sites in 14,610 abstracts for mouse. This includes evidence sentences for 1,803 sites identified in CPTAC studies that are not covered by manually curated functional information in PhosphoSitePlus. Evaluation results demonstrate the potential of IDPpub as an effective text mining tool for collecting phosphosites from the biomedical literature. Moreover, the repository (https://www.zhang-lab.org/idppub/), which can be automatically updated, can serve as a powerful complement to existing resources.
    DOI:  https://doi.org/10.1016/j.mcpro.2023.100682
  19. Proc Conf Assoc Comput Linguist Meet. 2023 May;2023: 2339-2349
      The dissemination of false information on the internet has received considerable attention over the last decade. Misinformation often spreads faster than mainstream news, thus making manual fact checking inefficient or, at best, labor-intensive. Therefore, there is an increasing need to develop methods for the automatic detection of misinformation. Although resources for creating such methods are available in English, other languages are often underrepresented in this effort. With this contribution, we present IRMA, a corpus containing over 600,000 Italian news articles (335+ million tokens) collected from 56 websites classified as 'untrustworthy' by professional fact-checkers. The corpus is freely available and comprises a rich set of text- and website-level data, representing a turnkey resource for testing hypotheses and developing automatic detection algorithms. It contains texts, titles, and dates (from 2004 to 2022), along with three types of semantic measures (i.e., keywords, topics at three different resolutions, and LIWC lexical features). IRMA also includes domain-specific information such as source type (e.g., political, health, conspiracy, etc.), quality, and higher-level metadata, including several metrics of website incoming traffic that make it possible to investigate users' online behavior. IRMA constitutes the largest corpus of misinformation available today in Italian, making it a valid tool for advancing quantitative research on untrustworthy news detection and ultimately helping limit the spread of misinformation.