bims-aukdir 2025-09-21 papers

bims-aukdir

Biomed News

on Automated knowledge discovery in diabetes research

Issue of 2025-09-21
eleven papers selected by
Mott Given

Unfolding the diagnostic pipeline of diabetic retinopathy with artificial intelligence: A systematic review.
OcuViT: A Vision Transformer-Based Approach for Automated Diabetic Retinopathy and AMD Classification.
A machine learning algorithm for the prediction of complications incorporated in electronic medical records improves type 2 diabetes care.
Explainable cluster-based learning for prediction of postprandial glycemic events and insulin dose optimization in type 1 diabetes.
Correction: Engineering novel features for diabetes complication prediction using synthetic electronic health records.
Interpretable Machine Learning for Predicting Adverse Pregnancy Outcomes in Gestational Diabetes: Retrospective Cohort Study.
Diagnostic Performance of Machine Learning Algorithms for Predicting Heart Failure in Diabetic Patients: A Systematic Review and Meta-Analysis.
Heterogeneous Covariates-Aware Pseudo Supervised Meta-Learning for Few-shot Diabetes Classification.
Accessible healing phase classification of diabetic foot ulcer.
IoT-based Approach for Diabetes Patient Monitoring Using Machine Learning.
Investigating the metabolic reprogramming mechanisms in diabetic nephropathy: a comprehensive analysis using bioinformatics and machine learning.

Surv Ophthalmol. 2025 Sep 17. pii: S0039-6257(25)00170-5. [Epub ahead of print]

Unfolding the diagnostic pipeline of diabetic retinopathy with artificial intelligence: A systematic review.

K Suganya Devi, Hemanth Kumar Vasireddi, Gnv Raja Reddy, Satish Kumar Satti.

  Diabetic retinopathy (DR) is a leading cause of vision impairment globally, necessitating early and accurate detection through effective screening methods. We focus on the integration of artificial intelligence (AI) techniques in automating and enhancing DR diagnosis. Timely detection and classification of DR severity are critical for patient management and intervention. AI-driven DR classification frameworks typically consist of sequential stages: image preprocessing, optic disc (OD) localization and removal, blood vessel segmentation, feature extraction, and classification of DR severity. In the proposed and implemented model, each of these phases was systematically addressed to ensure improved performance. The implementation demonstrated superior accuracy, achieving 98.02% on the widely used MESSIDOR dataset. The pipeline incorporated effective preprocessing to enhance image quality, accurate OD localization and exclusion to avoid false detections, followed by precise vessel segmentation. Extracted features were then used to train deep learning models for DR severity classification. Comparative analysis with existing methods executed on the same dataset revealed that the proposed model outperformed other state-of-the-art techniques in terms of classification accuracy and robustness. We outline the recent progress in AI-based DR screening, highlighting the significance of each diagnostic phase and its role in improving overall performance. By evaluating multiple approaches and benchmarking them against an established dataset, the study emphasizes the transformative role of AI in DR diagnosis. Despite current challenges, AI holds substantial promise in clinical application, offering scalable, accurate, and efficient DR screening solutions that may significantly reduce the risk of blindness in diabetic patients.

Keywords:  Deep neural networks; Diabetic retinopathy; Fundus imaging; Image processing; Intelligent screening systems; Machine learning

DOI: https://doi.org/10.1016/j.survophthal.2025.09.008

J Imaging Inform Med. 2025 Sep 19.

OcuViT: A Vision Transformer-Based Approach for Automated Diabetic Retinopathy and AMD Classification.

Faisal Ahmed, M D Joshem Uddin.

  Early detection and accurate classification of retinal diseases, such as diabetic retinopathy (DR) and age-related macular degeneration (AMD), are essential to preventing vision loss and improving patient outcomes. Traditional methods for analyzing retinal fundus images are often manual and time-consuming and rely on the expertise of the clinician, leading to delays in diagnosis and treatment. Recent advances in machine learning, particularly deep learning, have introduced automated systems to assist in retinal disease detection; however, challenges such as computational inefficiency and limited robustness remain. This paper proposes a novel approach that utilizes vision transformers (ViT) through transfer learning to address challenges in ophthalmic diagnostics. We fine-tune a pre-trained ViT-Base-Patch16-224 model for DR and AMD classification tasks. To adapt the model for retinal fundus images, we implement a streamlined preprocessing pipeline that converts the images into PyTorch tensors and standardizes them, ensuring compatibility with the ViT architecture and improving model performance. We validated our model, OcuViT, on two datasets: the APTOS dataset for binary and five-level DR severity classification and the IChallenge-AMD dataset for AMD grading. In the five-class DR and AMD grading tasks, OcuViT outperforms all existing CNN- and ViT-based methods across multiple metrics, achieving superior accuracy and robustness. For the binary DR task, it delivers highly competitive performance. These results demonstrate that OcuViT effectively leverages ViT-based transfer learning with an efficient preprocessing pipeline, significantly improving the precision and reliability of automated ophthalmic diagnosis.
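
The preprocessing step described above (tensor conversion followed by standardization for a ViT-Base-Patch16-224 input) can be illustrated with a minimal pure-Python sketch. The normalization constants are the commonly used ImageNet statistics and are an assumption here; the abstract does not state which values OcuViT uses, and the helper names are hypothetical.

```python
# Minimal sketch of ViT-style input preparation (pure Python, no PyTorch).
# ASSUMPTION: these are the common ImageNet channel statistics; the abstract
# does not say which normalization constants OcuViT actually uses.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def num_patches(image_size: int = 224, patch_size: int = 16) -> int:
    """Number of non-overlapping square patches a ViT cuts an image into."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


def standardize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Scale 8-bit RGB values to [0, 1], then standardize each channel."""
    return tuple((c / 255.0 - m) / s for c, m, s in zip(rgb, mean, std))


# A 224x224 image with 16x16 patches yields 14 * 14 patches.
patches = num_patches()
# Hypothetical pixel, used only to show the arithmetic.
example = standardize_pixel((255, 0, 128))
```

In a PyTorch pipeline the same two operations would typically be applied to whole image tensors rather than to single pixels.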

Keywords:  DR-AMD grading; Ophthalmology; Retinal disease diagnosis; Vision transformer

DOI: https://doi.org/10.1007/s10278-025-01676-3

Diabetes Res Clin Pract. 2025 Sep 13. pii: S0168-8227(25)00914-3. [Epub ahead of print] 112900

A machine learning algorithm for the prediction of complications incorporated in electronic medical records improves type 2 diabetes care.

EGOAL - AMD Annals Study Group

AIMS: Early identification of patients with type 2 diabetes (T2D) at high risk for complications may help reduce clinical inertia and improve care quality. This study assessed the clinical impact of integrating a machine learning-based prediction tool into electronic medical records (EMRs) in Italian diabetes clinics.
METHODS: A validated algorithm estimating the 5-year risk of six major diabetes complications was embedded in the EMRs of 38 centers. A pre-post comparison over 12 months was conducted between patients whose risk score was generated (test group) and those eligible but not assessed (control group).
RESULTS: Among 138,558 eligible patients, 20,314 (14.7%) had at least one score generated. Compared to controls, test group patients showed significantly greater improvements in HbA1c ≤7.0% (+9.0% vs. +4.5%), LDL-C <70 mg/dL (+27.9% vs. +20.7%), and BMI <25 kg/m² (+16.5% vs. +11.0%), with larger reductions in HbA1c >8.0% (-18.4% vs. -10.1%). They also more frequently initiated antihypertensive, lipid-lowering, and cardio-renal protective therapies.
CONCLUSIONS: Embedding an AI-based prediction tool in routine clinical practice improved several quality indicators and therapeutic decisions. Its real-world application shows promise in overcoming clinical inertia and promoting personalized diabetes management.

Keywords:  Artificial intelligence; Chronic complications; Electronic patient records; Type 2 diabetes

DOI: https://doi.org/10.1016/j.diabres.2025.112900

PLOS Digit Health. 2025 Sep;4(9): e0000996

Explainable cluster-based learning for prediction of postprandial glycemic events and insulin dose optimization in type 1 diabetes.

Najib Ur Rehman, Ivan Contreras, Aleix Beneyto, Josep Vehi.

Effective management of postprandial glycemic excursions in type 1 diabetes requires accurate prediction of adverse events and personalized insulin adjustments informed by interpretable models. This study presents an explainable dual-prediction framework that simultaneously forecasts postprandial hypoglycemia and hyperglycemia within a 4-hour window using cluster-personalized ensemble models. Glycemic profiles were identified through a hybrid unsupervised approach combining self-organizing maps and k-means clustering, enabling the training of specialized random forest classifiers. The system outperformed baseline models on both real-world and simulated datasets, achieving high performance (AUC = 0.84 and 0.93; MCC = 0.47 and 0.73 for hypo- and hyperglycemia, respectively). Model interpretability was addressed using global (SHAP) and local (LIME) explanations, while interaction analysis revealed the non-linear effects of carbohydrate intake and insulin bolus combinations. An insulin adjustment module further refined pre-meal bolus recommendations based on predicted risk. Simulated evaluations confirmed improved postprandial time-in-range and reduced hypoglycemia without excessive hyperglycemia. These results underscore the potential of profile-driven and explainable machine learning approaches to support safer, individualized diabetes care.
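
As an illustration of the outcome measures mentioned above, the following sketch computes the fraction of glucose readings below, within, and above a target range over a postprandial window. The 70-180 mg/dL range, the even sampling, and the readings themselves are assumptions for illustration; they are not taken from the paper.

```python
# ASSUMPTIONS (not from the abstract): a 70-180 mg/dL target range and
# evenly spaced CGM samples; the readings below are made-up values.
def range_fractions(glucose_mg_dl, low=70.0, high=180.0):
    """Return the fractions of readings below, within, and above range."""
    if not glucose_mg_dl:
        raise ValueError("need at least one glucose reading")
    n = len(glucose_mg_dl)
    below = sum(1 for g in glucose_mg_dl if g < low)
    above = sum(1 for g in glucose_mg_dl if g > high)
    return {
        "below": below / n,
        "in_range": (n - below - above) / n,
        "above": above / n,
    }


# Hypothetical 4-hour postprandial trace sampled every 30 minutes.
trace = [110, 150, 185, 200, 170, 140, 95, 68]
fractions = range_fractions(trace)
```

Comparing these fractions before and after a simulated bolus adjustment is one simple way to express "improved time-in-range without excessive hyperglycemia".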

DOI: https://doi.org/10.1371/journal.pdig.0000996

Front Genet. 2025;16: 1687832

Correction: Engineering novel features for diabetes complication prediction using synthetic electronic health records.

Daniel Voskergian, Burcu Bakir-Gungor, Malik Yousef.

  [This corrects the article DOI: 10.3389/fgene.2025.1451290.].

Keywords:  diabetes complications; feature engineering; feature selection; machine learning; predictive modeling; risk prediction; synthetic electronic health records (EHRs)

DOI: https://doi.org/10.3389/fgene.2025.1687832

JMIR Med Inform. 2025 Sep 16;13: e71539

Interpretable Machine Learning for Predicting Adverse Pregnancy Outcomes in Gestational Diabetes: Retrospective Cohort Study.

Jiaxi Li, Xiali Liu, Shenyang He, Yan Ren.

Background: Gestational diabetes mellitus (GDM) affects over 5% of pregnancies worldwide, elevating risks of postpartum type 2 diabetes and of complications such as fetal death, miscarriage, and congenital abnormalities. Effective GDM management is essential to balance glycemic control and pregnancy outcomes.
Objective: We aim to develop interpretable machine learning models using GDM datasets for predicting adverse pregnancy outcomes and identifying key factors through the Shapley additive explanations (SHAP) algorithm, thus supporting improved maternal and infant health.
Methods: Data preprocessing and feature selection were performed, with adaptive synthetic sampling used to address class imbalance. Classification models, including logistic regression, random forest, support vector machine, and extreme gradient boosting, were built and enhanced through the stacking method. Model interpretability was assessed with SHAP to quantify feature contributions.
Results: Among 1670 patients, 200 experienced adverse outcomes. The stacking model outperformed individual models, achieving an accuracy of 85.6%, a sensitivity of 57.8%, a specificity of 95.9%, and an area under the receiver operating characteristic curve of 0.82 on the test set. External validation on 159 patients showed a decline in performance (accuracy 83.6%, area under the receiver operating characteristic curve 0.67). SHAP analysis identified gestational age, glucose control, and diagnosis time among the most influential predictors, providing clinically meaningful insights into risk factors. Additionally, detailed SHAP-based visualization revealed the distribution of different feature values and their nonlinear impact on outcomes, as well as interaction effects between features. These interpretable analyses enabled a deeper understanding of individual and combined feature contributions, thereby enhancing clinical assessment capabilities.
Conclusions: This study underscores the potential of machine learning in predicting adverse outcomes in GDM, with interpretable features offering valuable clinical insights to enhance pregnancy management and maternal-infant health.
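
The metrics reported above can be related to one another with a short sketch. The cohort prevalence uses the counts given in the abstract (200 adverse outcomes among 1670 patients); the confusion-matrix counts are hypothetical, since the abstract does not report them.

```python
def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy, and balanced accuracy from counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# From the abstract: 200 of 1670 patients had an adverse outcome.
prevalence = 200 / 1670

# HYPOTHETICAL counts for an imbalanced test set (not from the study).
metrics = binary_metrics(tp=30, fp=10, tn=240, fn=20)
```

With roughly 12% positives, accuracy is driven mostly by the majority class, which is one reason sensitivity and specificity are reported alongside it.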

Keywords:  adverse pregnancy outcomes; ensemble learning; gestational diabetes mellitus; interpretable model; machine learning; risk prediction

DOI: https://doi.org/10.2196/71539

Endocrinol Diabetes Metab. 2025 Sep;8(5): e70111

Diagnostic Performance of Machine Learning Algorithms for Predicting Heart Failure in Diabetic Patients: A Systematic Review and Meta-Analysis.

Pooya Eini, Peyman Eini, Homa Serpoush, Mohammad Rezayee.

BACKGROUND: Heart failure (HF) is a significant complication in diabetic patients, and machine learning (ML) algorithms offer potential for early prediction. This systematic review and meta-analysis evaluated the diagnostic performance of ML models in predicting HF among diabetic patients.
METHODS: We searched PubMed, Web of Science, Embase, ProQuest, and Scopus, identifying 2830 articles. After deduplication and screening, 16 studies were included, with 7 providing data for meta-analysis. Study quality was assessed using PROBAST+AI. A bivariate random-effects model (Stata, midas, metadta) pooled sensitivity, specificity, likelihood ratios, and diagnostic odds ratio (DOR) for best-performing algorithms, with subgroup analyses. Heterogeneity (I²) and publication bias were assessed.
RESULTS: This meta-analysis of seven studies evaluating ML models for HF detection demonstrated a pooled sensitivity of 84% (95% CI: 75%-90%), specificity of 86% (95% CI: 56%-97%), and an area under the ROC curve of 0.90 (95% CI: 0.87-0.93). The pooled positive likelihood ratio was 6.6 (95% CI: 1.2-35.9), and the negative likelihood ratio was 0.17 (95% CI: 0.08-0.36), with a diagnostic odds ratio of 39 (95% CI: 4-423). Significant heterogeneity was observed, primarily related to differences in study populations, ML algorithms, dataset sizes, and validation methods. No significant publication bias was detected.
CONCLUSION: Machine learning models demonstrate promising diagnostic accuracy for heart failure detection and have the potential to support early diagnosis and risk assessment in clinical practice. However, considerable heterogeneity across studies and limited external validation highlight the need for standardised development, prospective validation, and improved interpretability of ML models to ensure their effective integration into healthcare systems.
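
For readers less familiar with the summary measures above, this sketch shows the standard definitions linking sensitivity and specificity to likelihood ratios and the diagnostic odds ratio. It plugs in the rounded point estimates from the abstract; the results are close to, but not identical with, the reported pooled values (6.6, 0.17, 39), which plausibly reflects rounding of the inputs and the fact that the published figures come from the fitted bivariate model.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, DOR) from sensitivity and specificity (0-1 scale)."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg, lr_pos / lr_neg


# Rounded pooled point estimates quoted in the abstract.
lr_pos, lr_neg, dor = likelihood_ratios(0.84, 0.86)
```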

Keywords:  diabetes; diagnostic accuracy; heart failure; machine learning; predictive modelling

DOI: https://doi.org/10.1002/edm2.70111

IEEE Trans Comput Biol Bioinform. 2025 Sep 16. PP

Heterogeneous Covariates-Aware Pseudo Supervised Meta-Learning for Few-shot Diabetes Classification.

Lei Wang, Wei Liu, Deheng Cai, Linong Ji, Dawei Shi, Ke Yao, Qian Yang.

OBJECTIVE: Limited labeled data hinder the application of medical artificial intelligence to diabetes classification. In this paper, a pseudo-label supervised meta-learning algorithm supported by heterogeneous covariate data is proposed to perform diabetes classification with few labeled samples.
METHODS: First, clustering algorithms are employed to generate pseudo labels for samples, which are then used to create multiple pseudo-supervised tasks for meta-learning within a few-shot learning framework. Second, the time and date features of dynamically monitored glucose data are extracted as dynamic covariates, while physiological indicators from one-time clinical measurements serve as static covariates. By incorporating these heterogeneous covariates, the model inputs are enriched from multiple perspectives, compensating for the homogeneity of the data and providing complementary information. Finally, a pseudo-supervised meta-learning algorithm is proposed to learn the data features supported by heterogeneous covariates in a task-driven manner. The optimal model is then fine-tuned on downstream real diabetes classification tasks, enabling rapid adaptation to unseen tasks.
RESULTS: The proposed algorithm is thoroughly evaluated using clinical data, achieving an accuracy of 95.994% and an F1 score of 91.261%.
CONCLUSION: The proposed method remains preferable for diabetes classification when compared to the state-of-the-art methods.
SIGNIFICANCE: The approach offers an effective strategy for diabetes classification tasks with incomplete and limited labeled data.
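
The first step described above, turning cluster-derived pseudo labels into few-shot tasks, can be sketched as follows. This is a generic N-way K-shot episode sampler, not the authors' implementation; the function name, parameters, and pseudo labels are all hypothetical.

```python
import random
from collections import defaultdict


def sample_episode(pseudo_labels, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot task from cluster-derived pseudo labels.

    Returns (support, query): lists of (sample_index, episode_class) pairs.
    """
    by_cluster = defaultdict(list)
    for idx, label in enumerate(pseudo_labels):
        by_cluster[label].append(idx)
    # Only clusters with enough members can supply support + query samples.
    eligible = sorted(
        c for c, members in by_cluster.items() if len(members) >= k_shot + n_query
    )
    if len(eligible) < n_way:
        raise ValueError("not enough sufficiently large clusters")
    support, query = [], []
    for episode_class, cluster in enumerate(rng.sample(eligible, n_way)):
        picked = rng.sample(by_cluster[cluster], k_shot + n_query)
        support += [(i, episode_class) for i in picked[:k_shot]]
        query += [(i, episode_class) for i in picked[k_shot:]]
    return support, query


# HYPOTHETICAL pseudo labels, as if produced by a clustering step.
pseudo_labels = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
support, query = sample_episode(
    pseudo_labels, n_way=2, k_shot=2, n_query=1, rng=random.Random(0)
)
```

Many such episodes would be drawn during meta-training, with the true diabetes labels used only in the later fine-tuning stage.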

DOI: https://doi.org/10.1109/TCBBIO.2025.3610741

Comput Biol Med. 2025 Sep 12. pii: S0010-4825(25)01418-0. [Epub ahead of print] 197(Pt B): 111066

Accessible healing phase classification of diabetic foot ulcer.

Reza Basiri, Charles de Mestral, Milos R Popovic, Shehroz S Khan.

OBJECTIVE: Diabetic foot ulcers (DFU) are complex wounds that, without proper treatment, can lead to leg amputation. Treatment of a DFU is multifaceted and requires a high level of clinical expertise. This study aims to inform medical triage and improve healing rates by developing an accessible, automated classification of wound healing phases and by identifying the clinical and wound features essential for that classification.
APPROACH: Machine learning models were evaluated using the Zivot DFU dataset to classify patients' wound healing phases into inflammation, proliferation, and remodeling classes. The models were trained on clinical features from 268 unique patients and 890 data points, including 80 features describing patient demographics, comorbidities, and wound characteristics. The area under the receiver operating characteristic curve, accuracy, and F1 values were used to assess the models' performance, and Shapley Additive Explanations were used to analyze the importance of features.
RESULTS: 56 of the 80 features provided an accuracy of 65%, while 22 essential features were enough to achieve a lower but statistically similar accuracy with an ordinal three-class classification using random forest classifiers.
INNOVATION: This study provides a novel approach to classifying the wound-healing phase of a DFU based on key features available from wound- and patient-level metadata.
CONCLUSION: The accessible and automated machine learning approach's ease of use and reliability will promote early and continuous autonomous medical triaging, ultimately improving patient outcomes. Additionally, the identified essential clinical features correlating with healing phases provide insight into DFU data management.

Keywords:  Classification; Clinical metadata; Diabetic foot ulcer; Healing phase; Machine learning

DOI: https://doi.org/10.1016/j.compbiomed.2025.111066

SLAS Technol. 2025 Sep 13. pii: S2472-6303(25)00106-2. [Epub ahead of print] 100348

IoT-based Approach for Diabetes Patient Monitoring Using Machine Learning.

Sarra Ayouni, Muhammad Hamza Khan, Muhammad Ibrahim, Mohamed Maddeh, Nadeem Sarwar, Nazik Alturki.

  This study presents an IoT-based framework for real-time diabetes monitoring and management, addressing key limitations identified in previous studies by integrating four datasets: BVH Dataset, PIMA Diabetes Dataset, Simulated Dataset, and an Integrated Dataset. The proposed approach ensures diverse demographic representation and a wide range of features including real-time vital signs (e.g., oxygen saturation, pulse rate, temperature) and subjective variables (e.g., skin color, moisture, consciousness level). Advanced preprocessing techniques, including Kalman Filtering for noise reduction, KNN imputation for addressing missing data, and SMOTE-ENN for improving data quality and class balance, were employed. These methods resulted in a 25% improvement in Recall and a 20% increase in the F1-score, demonstrating the model's effectiveness and robustness. By applying PCA and SHAP for feature engineering, high-impact features were identified, enabling the tuning of models such as Random Forest, SVM, and Logistic Regression, which achieved an accuracy of 97% and an F1-score of 0.98. A novel triage system, integrated with edge and cloud computing, classifies health status in real-time (Green, Yellow, Red, Black), reducing latency by 35%. The proposed system sets a new benchmark for scalable, individualized diabetes care in IoT-based healthcare solutions, demonstrating significant improvements in accuracy, response time, and feature incorporation compared to prior works.
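
The noise-reduction step mentioned above can be illustrated with a scalar Kalman filter. This is a textbook constant-state filter, not the authors' configuration; the variance parameters, the initial uncertainty, and the readings are assumptions chosen for illustration.

```python
def kalman_smooth(measurements, process_var=0.01, measurement_var=1.0):
    """Scalar Kalman filter with a constant-state (random-walk) model."""
    if not measurements:
        return []
    estimate = float(measurements[0])  # start from the first reading
    error_var = 1.0  # initial uncertainty (assumed value)
    smoothed = []
    for z in measurements:
        # Predict: the state is assumed constant, so only uncertainty grows.
        error_var += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        smoothed.append(estimate)
    return smoothed


# HYPOTHETICAL noisy oxygen-saturation readings (%).
raw = [96, 100, 96, 100, 96, 100, 96, 100]
smoothed = kalman_smooth(raw)
```

A larger `measurement_var` relative to `process_var` makes the filter trust the sensor less and smooth more aggressively.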

Keywords:  Diabetes Mellitus; Diabetes decision Support system; Diabetes management; IOT-based framework; Machine Learning

DOI: https://doi.org/10.1016/j.slast.2025.100348

Front Cell Dev Biol. 2025;13: 1630708

Investigating the metabolic reprogramming mechanisms in diabetic nephropathy: a comprehensive analysis using bioinformatics and machine learning.

Shan He, Yi Wei Chen, Jian Ye, Yu Wang, Qin Kai Chen, Si Yi Liu.

Background: Diabetic nephropathy (DN) is a common complication of diabetes, characterized by damage to renal tubules and glomeruli, leading to progressive renal dysfunction. The aim of our study is to explore the key role of metabolic reprogramming (MR) in the pathogenesis of DN.
Methods: In our study, three transcriptome datasets (GSE30528, GSE30529, and GSE96804) were sourced from the Gene Expression Omnibus (GEO) database. These datasets were integrated for batch effect correction and subsequently subjected to differential expression analysis to identify differentially expressed genes (DEGs) between DN and control samples. The identified DEGs were cross-referenced with genes associated with MR to derive MR-related differentially expressed genes (MRRDEGs). These MRRDEGs underwent Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses. To identify key genes and develop diagnostic models, four machine learning algorithms were employed in conjunction with weighted gene co-expression network analysis (WGCNA) and the protein interaction tool CytoHubba. Gene set enrichment analysis (GSEA) and CIBERSORT analysis were conducted on the key genes to assess immune cell infiltration in DN. Additionally, a competitive endogenous RNA (ceRNA) network was constructed using the key genes. Finally, the expression levels of core genes in human samples were validated through quantitative real-time PCR (qRT-PCR).
Results: We identified 256 MRRDEGs, highlighting metabolic and inflammatory pathways in DN. KEGG analysis linked these genes to the MAPK signaling pathway, suggesting its key role in DN. Six key genes were pinpointed using WGCNA, PPI, and machine learning, with their diagnostic value confirmed by ROC analysis. CIBERSORT revealed a strong link between these genes and immune cell infiltration, indicating the immune response's role in DN. GSEA showed these genes' involvement in inflammatory and metabolic processes. A ceRNA network was predicted to clarify gene regulation. qRT-PCR confirmed the expression patterns of CXCR2, NAMPT, and CUEDC2, aligning with bioinformatics results.
Conclusion: Through bioinformatics analysis, six key MRRDEGs were identified, among which CUEDC2, NAMPT, and CXCR2 could serve as potential biomarkers.
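
The cross-referencing step in the methods (DEGs intersected with an MR gene list) amounts to a threshold filter followed by a set intersection. The sketch below uses placeholder gene names, made-up statistics, and assumed cutoffs (|log2 fold change| >= 1, adjusted p < 0.05); none of these values come from the paper.

```python
def select_degs(stats, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Keep genes passing both the fold-change and adjusted p-value cutoffs."""
    return {
        gene
        for gene, (log2_fc, padj) in stats.items()
        if abs(log2_fc) >= lfc_cutoff and padj < padj_cutoff
    }


# HYPOTHETICAL differential-expression results: gene -> (log2FC, adjusted p).
stats = {
    "GENE_A": (2.1, 0.001),
    "GENE_B": (-1.4, 0.02),
    "GENE_C": (0.3, 0.0001),  # significant but small effect
    "GENE_D": (1.8, 0.2),     # large effect but not significant
}
# HYPOTHETICAL metabolic-reprogramming gene list.
mr_genes = {"GENE_A", "GENE_C", "GENE_D", "GENE_E"}

degs = select_degs(stats)
mr_degs = sorted(degs & mr_genes)
```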

Keywords:  GEO database; bioinformatics; diabetic nephropathy; metabolic reprogramming; qRT-PCR

DOI:  https://doi.org/10.3389/fcell.2025.1630708