bims-skolko Biomed News
on Scholarly communication
Issue of 2025–06–15
29 papers selected by
Thomas Krichel, Open Library Society



  1. Res Integr Peer Rev. 2025 Jun 11. 10(1): 9
       BACKGROUND: Retractions undermine the scientific record's reliability and can lead to the continued propagation of flawed research. This study aimed to (1) create a dataset aggregating retraction information with bibliographic metadata, (2) train and evaluate various machine learning approaches to predict article retractions, and (3) assess each feature's contribution to feature-based classifier performance using ablation studies.
    METHODS: An open-access dataset was developed by combining information from the Retraction Watch database and the OpenAlex API. Using a case-controlled design, retracted research articles were paired with non-retracted articles published in the same period. Traditional feature-based classifiers and models leveraging contextual language representations were then trained and evaluated. Model performance was assessed using accuracy, precision, recall, and the F1-score.
    RESULTS: The Llama 3.2 base model achieved the highest overall accuracy. The Random Forest classifier achieved a precision of 0.687 for identifying non-retracted articles, while the Llama 3.2 base model reached a precision of 0.683 for identifying retracted articles. Traditional feature-based classifiers generally outperformed most contextual language models, except for the Llama 3.2 base model, which showed competitive performance across several metrics.
    CONCLUSIONS: Although no single model excelled across all metrics, our findings indicate that machine learning techniques can effectively support the identification of retracted research. These results provide a foundation for developing automated tools to assist publishers and reviewers in detecting potentially problematic publications. Further research should focus on refining these models and investigating additional features to improve predictive performance.
    TRIAL REGISTRATION: Not applicable.
    Keywords:  Machine learning; Retraction prediction; Scientific publishing
    DOI:  https://doi.org/10.1186/s41073-025-00168-w
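    The retraction-prediction study in item 1 above trains feature-based classifiers on paired retracted and non-retracted articles and reports accuracy, precision, recall, and F1. Below is a minimal sketch of that evaluation loop in scikit-learn; the synthetic data and the Random Forest configuration are illustrative assumptions, not the authors' dataset or pipeline.
      # Sketch: evaluating a feature-based retraction classifier with the metrics
      # reported in item 1 (accuracy, precision, recall, F1).
      # Synthetic stand-in data; 1 = retracted, 0 = matched non-retracted control.
      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import accuracy_score, precision_recall_fscore_support

      X, y = make_classification(n_samples=2000, n_features=10, weights=[0.5, 0.5],
                                 random_state=42)
      X_train, X_test, y_train, y_test = train_test_split(
          X, y, test_size=0.2, stratify=y, random_state=42)

      clf = RandomForestClassifier(n_estimators=500, random_state=42)
      clf.fit(X_train, y_train)
      y_pred = clf.predict(X_test)

      acc = accuracy_score(y_test, y_pred)
      prec, rec, f1, _ = precision_recall_fscore_support(y_test, y_pred, average="binary")
      print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} F1={f1:.3f}")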
  2. Cureus. 2025 May;17(5): e83924
      Background and aim: Artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT (San Francisco, CA: OpenAI) and Bard (now Gemini) (Mountain View, CA: Google), is increasingly used in scientific writing. However, its rapid adoption has raised ethical concerns, especially regarding plagiarism. Current standards of plagiarism detection for scientific writing require a costly and laborious process carried out by journal reviewers, raising the question of whether free LLM tools could streamline it. Further, AI-generated text can closely resemble genuine scientific writing, raising questions about authenticity and detection. Although various tools exist to identify AI-generated content, their effectiveness remains uncertain. The objective of this study is to evaluate the ability of free online AI tools to identify plagiarism in scientific papers.
    Methods: Three topics were used for this study: 3D evaluation of the anterior mandible, low-dose cone beam CT (CBCT), and ameloblastoma case reports. Plagiarized "mashup" papers of six paragraphs each were created by combining paragraphs from published papers - two papers on 3D evaluation, three on low-dose CBCT, and two on ameloblastoma. These mashups were tested for plagiarism using ChatGPT 3.5 and Bard (five times each) and SmallSEO (London, UK: SmallSEO Tools) (three times). ChatGPT and Bard were then prompted to rewrite the plagiarized mashups, and the rewrites were retested for plagiarism and evaluated for AI detection.
    Results: ChatGPT was unable to identify plagiarism (0/15). Bard detected plagiarism in 8/15 trials but never identified all plagiarized text. SmallSEO identified 100% of the plagiarism and correctly sourced it, but after AI rewrites it missed 87/90 plagiarized paragraphs. Neither AI-detection tool could definitively detect the AI-generated rewrites, with likelihoods never exceeding 70%. In short, ChatGPT and Bard were unable to reliably detect plagiarism, AI-rewritten content was undetectable by plagiarism checkers, and AI-detection tools could not definitively identify AI-generated text.
    Conclusion: ChatGPT, Bard, and SmallSEO are currently unable to reliably identify plagiarism in scientific text. Further, these generative AI tools are capable of rewriting plagiarized text to evade plagiarism detection. Finally, AI-detection tools cannot reliably detect the use of AI in AI-rewritten text.
    Keywords:  artificial intelligence; artificial intelligence in scientific writing; chatgpt; plagiarism; scientific writing and artificial intelligence
    DOI:  https://doi.org/10.7759/cureus.83924
  3. Eur J Radiol. 2025 Jun 06. pii: S0720-048X(25)00309-2. [Epub ahead of print]190 112223
       PURPOSE: To investigate the experience of radiology researchers with medical image falsification in the scientific literature.
    METHODS: Corresponding authors of articles published in the top 12 general radiology journals in 2024 were invited to take part in a survey regarding medical image falsification in the scientific literature.
    RESULTS: A total of 310 corresponding authors participated in this survey. Thirty-seven participants (11.9%) reported having committed some form of medical image falsification in the past five years, while 115 participants (37.1%) reported witnessing such falsification by colleagues during the same period. Cherry-picking images to support conclusions (i.e., selectively choosing specific, nonrepresentative images that confirm a desired result or argument) was the most common type of medical image falsification (50.3%), followed by duplicating or reusing images without formal permission (24.9%), and enhancing images in a way that results in the misrepresentation of data or findings (13.7%). Being female was significantly associated with lower odds of committing medical image falsification in the past five years compared to being male (odds ratio (OR): 0.154, 95% confidence interval (CI): 0.045-0.531; P = 0.003). Similarly, researchers without a medical doctor (MD) degree were less likely to have committed falsification than those with an MD degree (OR: 0.286, 95% CI: 0.096-0.848; P = 0.024).
    CONCLUSION: A substantial share of radiology researchers has engaged in falsifying medical images in the scientific literature over the last five years. Female gender and not holding an MD degree were both significantly associated with lower odds of committing medical image falsification.
    Keywords:  Pressure; Publications; Radiology; Scientific misconduct
    DOI:  https://doi.org/10.1016/j.ejrad.2025.112223
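    Item 3 above reports odds ratios with 95% confidence intervals (e.g., OR 0.154, 95% CI 0.045-0.531 for female versus male respondents). Below is a minimal sketch of a crude (unadjusted) odds ratio with a Wald interval; the cell counts are invented for illustration, and the paper's estimates may come from a regression model rather than a simple 2x2 table.
      # Sketch: odds ratio with a 95% Wald confidence interval from a 2x2 table.
      # Counts are hypothetical, not the survey's data.
      import math

      a, b = 4, 96     # female respondents: committed falsification / did not (assumed)
      c, d = 33, 177   # male respondents:   committed falsification / did not (assumed)

      or_hat = (a * d) / (b * c)
      se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)
      lo = math.exp(math.log(or_hat) - 1.96 * se_log_or)
      hi = math.exp(math.log(or_hat) + 1.96 * se_log_or)
      print(f"OR = {or_hat:.3f}, 95% CI {lo:.3f}-{hi:.3f}")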
  4. Science. 2025 Jun 12. 388(6752): 1121-1122
      Driven largely by open access, the trend puts society programming at risk.
    DOI:  https://doi.org/10.1126/science.adz6859
  5. Nature. 2025 Jun 06.
      
    Keywords:  Institutions; Publishing; Scientific community
    DOI:  https://doi.org/10.1038/d41586-025-01727-3
  6. Nature. 2025 Jun 10.
      
    Keywords:  Peer review; Publishing; Scientific community
    DOI:  https://doi.org/10.1038/d41586-025-01826-1
  7. Nature. 2025 Jun 09.
      
    Keywords:  Careers; Publishing; Research management
    DOI:  https://doi.org/10.1038/d41586-025-01824-3
  8. Lancet. 2025 Jun 07. pii: S0140-6736(25)00821-9. [Epub ahead of print]405(10494): 2046-2047
      
    DOI:  https://doi.org/10.1016/S0140-6736(25)00821-9
  9. Plast Reconstr Surg. 2025 Jun 10.
       BACKGROUND: The growing use of artificial intelligence (AI) in academic writing raises concerns about the integrity of scientific manuscripts and the ability to accurately distinguish human-written from AI-generated content. This study evaluates the ability of medical professionals and AI-detection tools to identify AI involvement in plastic surgery manuscripts.
    METHODS: Eight manuscript passages across four topics were assessed, with four on plastic surgery. Passages were human-written, human-written with AI edits, or fully AI-generated. Twenty-four raters, including medical students, residents, and attendings, classified the passages by origin. Interrater reliability was measured using Fleiss' kappa. Human-written and AI-generated manuscripts were also analyzed using three different online AI detection tools, each of which reports the percentage of a passage it judges to be AI-generated. A receiver operating characteristic (ROC) analysis was conducted to assess the tools' accuracy in detecting AI-generated content, and intraclass correlation coefficients (ICCs) were calculated to assess agreement among the tools.
    RESULTS: Raters correctly identified the origin of passages 26.5% of the time. For AI-generated passages, accuracy was 34.4%, and for human-written passages, 14.5% (p=0.012). Interrater reliability was poor (kappa=0.078). AI detection tools showed strong discriminatory power (AUC=0.962), but false positives were frequent at optimal cutoffs (25%-50%). The ICC between tools was low (-0.118).
    CONCLUSIONS: Medical professionals and AI detection tools struggle to reliably identify AI-generated content. While AI tools demonstrated high discriminatory power, they often misclassified human-written passages. These findings highlight the need for improved methods to protect the integrity of scientific writing and prevent false plagiarism claims.
    DOI:  https://doi.org/10.1097/PRS.0000000000012242
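    The ROC analysis in item 9 above treats each detector's reported percentage of AI-generated content as a score for separating AI-generated from human-written passages. Below is a minimal sketch in scikit-learn; the labels and scores are invented for illustration.
      # Sketch: ROC analysis of AI-detector scores (percent "AI-generated" per passage).
      # Labels and scores are invented for illustration only.
      import numpy as np
      from sklearn.metrics import roc_auc_score, roc_curve

      y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])          # 1 = AI-generated, 0 = human-written
      scores = np.array([92, 88, 60, 45, 55, 30, 12, 5])   # detector output, % AI-generated

      auc = roc_auc_score(y_true, scores)
      fpr, tpr, thresholds = roc_curve(y_true, scores)

      # Inspect candidate cutoffs (e.g., the 25-50% range discussed above) and their
      # true-positive and false-positive rates.
      for f, t, thr in zip(fpr, tpr, thresholds):
          print(f"threshold >= {thr}%  TPR={t:.2f}  FPR={f:.2f}")
      print(f"AUC = {auc:.3f}")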
  10. Ir J Med Sci. 2025 Jun 12.
       BACKGROUND: There is huge interest in the use of artificial intelligence (AI) in the production and assessment of academic material; however, the role of AI remains unclear.
    AIM: The purpose of this study was to perform a reviewer-blinded assessment of the quality of scientific discussion generated by an advanced AI language model (ChatGPT-4, OpenAI) and to determine whether this could be recommended for high-impact journal publication.
    METHODS: The introduction, methods and results sections of a recently published article from a high-impact journal were input into a current AI model. The AI application then produced a discussion and conclusion based on the provided text using a standardized prompt. Six experienced blinded reviewers scored all five sections of the hybrid article. A one-way analysis of variance (ANOVA) was used to assess significant differences between scores of each section. Reviewers recommended a decision regarding the suitability of the article for publication.
    RESULTS: AI composed a scientific discussion and conclusion. The median score was 80 (IQR 70-90) for the introduction, 77.5 (IQR 70-90) for the methods, 82.5 (IQR 50-90) for the results, 60 (IQR 40-75) for the discussion and 60 (IQR 40-80) for the conclusion. The median scores for the AI-generated sections were lower than those for the other sections, but the difference was not statistically significant (p = 0.37). The majority of reviewers (5/6, 83%) recommended "acceptance for publication after major revision". One reviewer recommended "resubmission with no guarantee of acceptance". There were no recommendations for rejection.
    CONCLUSION: Current AI large language models are capable of generating content that passes experienced peer review and, after revision, is acceptable for publication in a high-impact orthopaedic journal. There are still many concerns regarding the integration of AI into the process of scientific writing, chiefly its reliance on advanced pattern recognition and its tendency to produce fabricated or inadequate references.
    LEVEL OF EVIDENCE: Level IV.
    Keywords:  AI; Artificial intelligence; Author; Chat GPT; Large language model; Peer review
    DOI:  https://doi.org/10.1007/s11845-025-03971-y
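    Item 10 above compares reviewer scores across the five manuscript sections with a one-way ANOVA. Below is a minimal sketch using SciPy; the score lists are hypothetical stand-ins for the six blinded reviewers' ratings.
      # Sketch: one-way ANOVA across reviewer scores for the five manuscript sections.
      # Scores are hypothetical; the study used six blinded reviewers.
      from scipy.stats import f_oneway

      intro      = [80, 70, 90, 85, 75, 80]
      methods    = [78, 70, 90, 80, 75, 77]
      results    = [82, 50, 90, 85, 80, 83]
      discussion = [60, 40, 75, 65, 55, 60]   # AI-generated section
      conclusion = [60, 40, 80, 65, 55, 60]   # AI-generated section

      f_stat, p_value = f_oneway(intro, methods, results, discussion, conclusion)
      print(f"F = {f_stat:.2f}, p = {p_value:.3f}")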
  11. J Am Acad Psychiatry Law. 2025 Jun 10. 53(2): 136-139
      
    Keywords:  academic writing; artificial intelligence; ethics; forensic psychiatry; professionalism; publication
    DOI:  https://doi.org/10.29158/JAAPL.250020-25
  12. J Pharm Bioallied Sci. 2025 May;17(Suppl 1): S66-S69
      Navigating the intricacies of writing a peer-reviewed medical article demands meticulous adherence to guidelines and an acute understanding of the publication process. This discourse provides a comprehensive overview, emphasizing the crucial components of a successful publication. Delving into structuring nuances, it delineates the essential elements from title formulation to conclusion drafting. The significance of adherence to journal-specific guidelines, avoidance of plagiarism, and recognition of impact factors is underscored. Furthermore, it describes the distinct sections of a medical article, including Introduction, Materials and Methods, Results, and Discussion, and explains their roles in conveying research findings effectively. Notably, it expounds on the diverse types of health science articles, ranging from original research to systematic reviews, delineating their unique characteristics. Moreover, insights into indexing criteria and impact assessment metrics, such as the h-index and journal impact factor, are provided, along with their implications for publication success. Finally, common reasons for article rejection and strategies for enhancing publication prospects are discussed, underscoring the evolving landscape of evidence-based medical writing.
    Keywords:  Article writing; journal publications; medical publications; peer-reviewed journal
    DOI:  https://doi.org/10.4103/jpbs.jpbs_651_24
  13. Health Psychol Rev. 2025 Jun 13. 1-18
      Scientific journals play a crucial role in promoting open science. The Transparency and Openness Promotion (TOP) guidelines identify a range of standards that journals can adopt to promote the verifiability of the research they publish. We evaluated the adoption of TOP standards within health psychology and behavioural medicine journal policies, as this had not yet been systematically assessed. In a cross-sectional study of 19 health psychology and behavioural medicine journals, eight raters evaluated TOP standard adoption using the TRUST journal policy evaluation tool. Out of a total possible score of 29, journal scores ranged from 1 to 13 (median = 6). Standards related to the use of reporting guidelines and data transparency were adopted the most, whereas standards related to pre-registration of study analysis plans and citation of code were adopted the least. TOP guidelines have to date been poorly adopted within health psychology and behavioural medicine journal policies. There are several relatively straightforward opportunities for improvement, such as expanding policies around research data to also consider code and materials, and reducing ambiguity of wording. However, other improvements may require a collaborative approach involving all research stakeholders.
    Keywords:  Open science; TOP factor; journal policies; publishing guidelines; transparency; verifiability
    DOI:  https://doi.org/10.1080/17437199.2025.2516010
  14. Risk Anal. 2025 Jun 06.
      Scientists, publishers, and journal editors are wondering how, whether, and to what extent artificial intelligence (AI) tools might soon help to advance the rigor, efficiency, and value of scientific peer review. Will AI provide timely, useful feedback that helps authors improve their manuscripts while avoiding the biases and inconsistencies of human reviewers? Or might it instead generate low-quality verbiage, add noise and errors, reinforce flawed reasoning, and erode trust in the review process? This perspective reports on evaluations of two experimental AI systems: (i) a "Screener" available at http://screener.riskanalysis.cloud/ that gives authors feedback on whether a draft paper (or abstract, proposal, etc.) appears to be a fit for the journal Risk Analysis, based on the guidance to authors provided by the journal (https://www.sra.org/journal/what-makes-a-good-risk-analysis-article/); and (ii) a more ambitious "Reviewer" (http://aia1.moirai-solutions.com/) that gives substantive technical feedback and recommends how to improve the clarity of methodology and the interpretation of results. The evaluations were conducted by a convenience sample of Risk Analysis Area Editors (AEs) and authors, including two authors of manuscripts in progress and four authors of papers that had already been published. The Screener was generally rated as useful. It has been deployed at Risk Analysis since January of 2025. On the other hand, the Reviewer had mixed ratings, ranging from strongly positive to strongly negative. This perspective describes both the lessons learned and potential next steps in making AI tools useful to authors prior to peer review by human experts.
    Keywords:  artificial intelligence; augmented intelligence; human–machine interaction; large language models (LLMs); peer review; risk analysis; scientific journals; scientific publishing
    DOI:  https://doi.org/10.1111/risa.70055
  15. AMIA Jt Summits Transl Sci Proc. 2025 ;2025 177-186
      As genomic research continues to advance, sharing of genomic data and research outcomes has become increasingly important for fostering collaboration and accelerating scientific discovery. However, such data sharing must be balanced with the need to protect the privacy of individuals whose genetic information is being utilized. This paper presents a bidirectional framework for evaluating privacy risks associated with data shared (both summary statistics and research datasets) in genomic research papers, focusing in particular on re-identification risks such as membership inference attacks (MIA). The framework consists of a structured workflow that begins with a questionnaire designed to capture researchers' (authors') self-reported data sharing practices and privacy protection measures. Responses are used to estimate the re-identification risk for the study (paper) and are compared against the National Institutes of Health (NIH) genomic data sharing policy. Any gaps in compliance help identify potential vulnerabilities and encourage researchers to enhance their privacy measures before submitting their research for publication. The paper also demonstrates the application of this framework, using published genomic research as case study scenarios, to emphasize the importance of bidirectional frameworks for supporting trustworthy open science and genomic data sharing practices.
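    The workflow in item 15 compares authors' self-reported sharing practices against the NIH genomic data sharing policy and flags compliance gaps before submission. Below is a minimal sketch of that comparison step; the checklist items and answers are hypothetical placeholders, not the framework's actual questionnaire.
      # Sketch: flag gaps between self-reported practices and a policy-derived checklist.
      # Checklist items and answers are hypothetical placeholders.
      policy_checklist = {
          "controlled_access_repository": True,    # individual-level data under controlled access
          "consent_covers_sharing": True,          # consent/institutional certification documented
          "summary_stats_risk_reviewed": True,     # summary statistics screened for re-identification risk
      }
      author_responses = {
          "controlled_access_repository": True,
          "consent_covers_sharing": True,
          "summary_stats_risk_reviewed": False,
      }

      gaps = [item for item, required in policy_checklist.items()
              if required and not author_responses.get(item, False)]

      print(f"compliance: {len(policy_checklist) - len(gaps)}/{len(policy_checklist)} items")
      for item in gaps:
          print(f"  gap: {item} -> revisit privacy measures before submission")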
  16. Nature. 2025 Jun 09.
      
    Keywords:  Anthropology; Careers; Geography; Peer review; Publishing
    DOI:  https://doi.org/10.1038/d41586-025-01606-x
  17. Turk Arch Otorhinolaryngol. 2025 Jun 10.
       Objective: The Principles of Transparency and Best Practice in Scholarly Publishing are among the key standards for the functioning and publishing quality of peer-reviewed scientific journals. The aim of this study is to evaluate Turkish otorhinolaryngology, head and neck surgery (ORL-HNS) journals against these principles and to point out areas that need improvement.
    Methods: This descriptive study is based on an evaluation of the website contents of eight Turkish ORL-HNS journals against the 16 principles of transparency. The numbers of scientific papers published in 2020 and 2021 were retrieved from the journals' websites, and impact factors were calculated from 2022 citations via Google Scholar. The relationship between impact factor and compliance with the transparency principles was then examined, to draw attention to international standards that can help these journals in international scholarly publishing.
    Results: The journals largely comply with the website publishing, ethics, access, and ownership criteria; however, most of them do not comply with the advisory council, advertising, other income, and business practices criteria. The three journals with the highest impact factors comply with 12 to 14 of the 16 criteria, whereas the three with the lowest comply with 5 to 12.
    Conclusion: The observation that journals with high transparency scores also have high impact factors suggests that these criteria matter for the reliability and validity of the published information, and for citation. Moreover, the websites of Turkish scientific ORL-HNS journals need improvement against the transparency criteria, especially on financial matters such as business practices, financial status, and advertising.
    Keywords:  Otorhinolaryngology; best practice; journal article; journal impact factor; peer review; scholarly publishing
    DOI:  https://doi.org/10.4274/tao.tao.2025.2024-5-4
  18. PLoS Biol. 2025 Jun 10. 23(6): e3003230
      The era of artificial scientific intelligence is here. As algorithms generate discoveries at scale, what role remains for human scientists?
    DOI:  https://doi.org/10.1371/journal.pbio.3003230
  19. Br J Psychiatry. 2025 Jun 09. 1-5
      Recent changes to US research funding are having far-reaching consequences that imperil the integrity of science and the provision of care to vulnerable populations. Resisting these changes, the BJPsych Portfolio reaffirms its commitment to publishing mental science and advancing psychiatric knowledge that improves the mental health of one and all.
    Keywords:  Ethics; integrity; psychiatry; publishing; science
    DOI:  https://doi.org/10.1192/bjp.2025.118