bims-aukdir Biomed News
on Automated knowledge discovery in diabetes research
Issue of 2026–06–14
thirteen papers selected by
Mott Given



  1. Front Endocrinol (Lausanne). 2026;17:1821550
       Background: Diabetic foot ulcers are a leading cause of non-traumatic lower-limb amputation, but early identification of inpatients at high risk of major amputation remains challenging.
    Methods: We retrospectively reviewed consecutive admissions for diabetic foot ulcers at a single center, developing models in a 2019-2020 cohort and temporally validating them in a later 2024 cohort. The outcome was in-hospital major lower-extremity amputation above the ankle. Candidate predictors were routinely available admission variables within 24 hours, including comorbidities, bedside limb/ulcer assessment, and standard laboratory tests. We compared logistic regression, elastic net, random forest, and extreme gradient boosting models and used Shapley additive explanations to provide patient-level interpretability.
    Results: The random forest model showed the best overall discrimination, with an area under the receiver operating characteristic curve of 0.977 in internal testing and 0.984 in temporal validation, and acceptable calibration. The most influential predictors reflected limb perfusion and infection severity and included perfusion grade, ankle-brachial index, maintenance dialysis, white blood cell count, C-reactive protein, and prior minor amputation.
    Conclusions: An explainable admission-data model can support early inpatient risk stratification for major amputation in diabetic foot ulcer patients and may help prioritize timely multidisciplinary care.
    Keywords:  Shapley additive explanations; diabetic foot ulcer; explainable model; hospitalization; major amputation; random forest; risk prediction; temporal validation
    DOI:  https://doi.org/10.3389/fendo.2026.1821550
  2. J Diabetes Res. 2026;2026(1): e7196309
       OBJECTIVE: This study aimed to develop and test a machine learning-based predictive tool to assist in identifying diabetic kidney disease (DKD).
    METHODS: The prediction models were developed and internally temporally validated using single-institution data. A total of 1463 patients seen at Shaanxi Provincial People's Hospital between March 2023 and September 2024 were included. Least absolute shrinkage and selection operator regression with 10-fold cross-validation was used to select the optimal features. We compared extreme gradient boosting, random forest (RF), support vector machine, and logistic regression across a range of metrics: area under the receiver operating characteristic curve (AUC-ROC), area under the precision-recall curve (AUC-PR), accuracy, precision, recall, kappa values, and F1-score. For each algorithm, a simplified model was developed using only routinely available clinical variables and was trained and evaluated on the same datasets as the full model. Decision curve analysis and calibration curves were used to evaluate the clinical utility of the optimal models. Feature importance was analyzed and interpreted via SHapley Additive exPlanations and Local Interpretable Model-agnostic Explanations.
    RESULTS: When screening for DKD in Type 2 diabetes, the full RF model achieved superior performance (AUC-ROC = 0.906, AUC-PR = 0.902, accuracy = 0.830, F1 = 0.847, precision = 0.794, recall = 0.907, and kappa = 0.657) and significantly outperformed the simplified RF model. It also exhibited a favorable clinical net benefit and well-calibrated performance. The most influential predictors identified in the full RF model were urine α1-microglobulin, hypertension, 24-h urinary total protein, duration of Type 2 diabetes mellitus, systolic blood pressure, serum retinol-binding protein, complement C1q, and 25-hydroxyvitamin D.
    CONCLUSION: An RF prediction model was developed to facilitate early screening of DKD, highlighting the significant roles of specific clinical and laboratory factors in disease prediction.
    Keywords:  LIME; SHAP; diabetic kidney disease; machine learning model; random forest
    DOI:  https://doi.org/10.1155/jdr/7196309
  3. Sci Rep. 2026 Jun 09.
      Diabetic retinopathy (DR) is still one of the main causes of vision loss worldwide, especially in places where people do not have easy access to regular eye checkups. Early and accurate disease detection is important to avoid permanent damage, but traditional methods are slow and sometimes inconsistent. This study proposes a deep learning framework that combines convolutional neural networks (CNNs), vision transformers, transfer learning, and ensemble techniques to improve DR detection. We used the APTOS 2019 dataset and tested 23 different pre-trained models. Then, we fine-tuned the top models and designed hybrid architectures by combining the best-performing CNNs and transformers in parallel and sequential ways to capture image spatial features in both short and long contexts. The best performance came from combining the top sequential hybrid models with soft voting, which achieved an accuracy of 93.10%, ROC AUC of 99.22%, and F1-score of 93.07%. The optimized model showed that mixing different models and using ensemble methods can lead to better and more stable DR detection decisions. Our approach is a step toward building a reliable and automated system that could help doctors in real-world settings.
    DOI:  https://doi.org/10.1038/s41598-026-55085-9
  4. Sci Rep. 2026 Jun 11.
      Diabetic retinopathy (DR) is one of the leading preventable causes of vision impairment globally and requires automated screening systems that are robust to heterogeneous imaging conditions and severely imbalanced classes. Although recent deep learning techniques perform well within individual datasets, their use in large-scale remote ophthalmic screening is limited by poor generalization from one dataset to another and low sensitivity to minority disease stages. We introduce Swin-DRNet, a robust transformer-based framework for automated DR screening that incorporates multi-stage contrast-adaptive preprocessing, class-aware representation learning, and focal-optimized loss design, all built into the hierarchical architecture of Swin Transformers. The method is designed to improve lesion visibility, stabilize learning across imbalanced classes, and increase the robustness of screening performance when fundus photographs from multiple sources are used as input. Extensive testing on three publicly available benchmark datasets (APTOS 2019, IDRiD, and Messidor2) shows that Swin-DRNet achieves consistently high-quality results across all datasets, with an overall accuracy of 96.68%, an F1-score of 96.20%, and a ROC-AUC of 99.83% on the combined dataset, while also maintaining high recall for clinically important advanced DR stages. These findings suggest that Swin-DRNet provides a reliable and scalable solution for real-world DR screening in teleophthalmology practices that use heterogeneous imaging platforms. To ensure transparency and reproducibility, the complete implementation of the proposed Swin-DRNet framework is publicly available at: https://github.com/damomtpcse/Swin-DRNe.
    Keywords:  Attention mechanism; Diabetic Retinopathy; Focal Loss Optimization; Gradient optimization; Loss Function Engineering; Retinal Disease Detection; Swin Transformer
    DOI:  https://doi.org/10.1038/s41598-026-57053-9
  5. Adv Wound Care (New Rochelle). 2026 Jun 08. 21621918261455459
       OBJECTIVE: Diabetic foot complications (DFCs) are common diabetes complications. Existing tools for predicting incident DFCs remain insufficient. This study aimed to develop and validate a novel machine learning-based model for incident DFC prediction.
    APPROACH: Using UK Biobank data, we built a longitudinal incident DFC cohort, with DFCs identified using International Classification of Diseases codes. Clinical features were screened by Cox models, and a machine learning model (DFC-Clin) was developed using fivefold cross-validation and leave-one-center-out validation. Performance was compared with diabetic foot risk stratification tools using the DeLong test. A web-based tool and risk stratification system were also developed.
    RESULTS: Among 502,175 participants, 29,766 individuals formed the incident cohort, with 1,252 incident DFC events. Demographics, blood markers, lifestyle factors, and comorbidities were selected to construct DFC-Clin, with glycated hemoglobin and body mass index emerging as the most predictive features. The model showed improved discrimination compared with existing risk stratification tools, achieving area under the receiver operating characteristic curves of 0.782 ± 0.042, 0.766 ± 0.042, and 0.747 ± 0.021 for 5-year, 10-year, and overall incident DFC prediction, respectively.
    INNOVATION: DFC-Clin is a machine learning model for incident DFC prediction that uses accessible clinical features from a large population-based cohort and is coupled with a web-based application and risk stratification system.
    CONCLUSION: DFC-Clin estimates the risk of incident DFC across multiple time horizons and demonstrates improved discrimination compared with existing approaches. The web-based application and stratification framework are intended to support risk identification and preventive decision-making. Further studies are required for clinical deployment and evaluation on more clinical outcomes, including amputations, recurrence, and health care costs.
    Keywords:  UK Biobank; diabetic foot; machine learning; prediction model
    DOI:  https://doi.org/10.1177/21621918261455459
  6. IEEE Trans Image Process. 2026 Jun 08. PP
      Recent advances in multi-view fundus imaging show great promise for automated diabetic retinopathy (DR) grading. However, mainstream end-to-end CNN/Transformer pipelines rely on striding or tokenization that compresses spatial detail, causing small, low-contrast lesions (e.g., microaneurysms) to be under-represented and creating performance ceilings. Prior efforts have mitigated this by incorporating external lesion- or vessel-level annotations into models. However, such labels are costly to acquire, break the end-to-end training, and make performance over-reliant on the annotation quality. To reduce dependence on expensive annotations, we propose an end-to-end framework that generates lesion proposals on the fly during training and inference, providing self-derived cues for grading. First, we introduce a Grade-Activated Lesion Proposal (GALP) module that derives grade-conditioned evidence maps (GEMs) from stage-wise auxiliary classifiers and selects the top-K high-evidence regions per view as lesion proposals. Second, we propose a Cross-View Lesion Expert Guided Regional Fusion (LGRF) module, which selectively activates experts for a view's lesion proposals based on contextual guidance from other views, ensuring that only the most relevant feature extractors contribute to fusion. Experimental results on two multi-view DR datasets show that our method matches or surpasses strong baselines without external annotations, confirming that self-generated proposals can substantially reduce annotation needs.
    DOI:  https://doi.org/10.1109/TIP.2026.3699089
  7. J Med Syst. 2026 Jun 12. pii: 96. [Epub ahead of print]50(1):
      The prevalence of gestational diabetes mellitus (GDM) continues to rise, necessitating reliable and effective self-management strategies to improve maternal and neonatal outcomes. However, current self-management models face challenges, including insufficient data monitoring and analysis, delayed modifications to treatment protocols, and excessive reliance on manual processes. With the expanding application of artificial intelligence (AI) in healthcare, its potential value in the self-management of GDM has attracted increasing attention. This systematic review aimed to synthesize the evidence on the application of AI technologies in the self-management of patients with GDM. The review was conducted in April 2025 and included comprehensive literature searches across PubMed, Embase, The Cochrane Library, Scopus, Web of Science, CINAHL, CBM, CNKI, VIP, and Wanfang databases. The search strategy combined Medical Subject Headings and free-text terms related to GDM, AI, machine learning, and self-management. Quantitative studies that explored the application of AI in the self-management of patients with GDM, including randomized controlled trials and cohort studies, were eligible. Two researchers independently performed study selection and data extraction, followed by quality assessment using risk-of-bias instruments appropriate for each study design. Data were synthesized using a narrative approach combined with thematic synthesis. The initial search yielded 18,973 records. After stepwise screening, 10 studies were included. A total of 645 patients with GDM completed AI-assisted interventions (from 661 initially enrolled), along with 864 control participants (from 877 enrolled). A variety of AI technologies were employed, including expert systems, machine learning, and natural language processing. Their primary functions included abnormality detection and alert triggering, personalized treatment plan generation and adjustment, and data integration and management. The studies reported multiple outcomes. Regarding health outcomes, six studies reported that AI interventions were associated with improved glycemic control, although heterogeneity was observed in delivery outcomes and insulin utilization rates. In terms of adherence, AI interventions tended to increase the frequency of blood glucose monitoring and data upload rates. Regarding system usability, limited data suggested that the accuracy of dietary recommendations and detection of blood glucose abnormalities was satisfactory, whereas the adoption rate of insulin treatment adjustment recommendations was relatively low. User satisfaction was generally high. Facilitators for implementation included technological advantages, user experience, and external support, whereas barriers included data integration and quality issues, technical and hardware or software limitations, patient acceptance, and difficulties in clinical integration. Preliminary evidence suggests that AI may contribute to the self-management of GDM; however, its practical application faces several obstacles. Future efforts should focus on conducting high-quality clinical research and evaluating implementation-related experiences to facilitate the integration of AI into GDM self-management.
    Keywords:  Artificial Intelligence; Gestational Diabetes Mellitus; Self-Management; Systematic Review
    DOI:  https://doi.org/10.1007/s10916-026-02419-9
  8. Sci Rep. 2026 Jun 09.
      Gestational diabetes mellitus (GDM) is a major health issue that causes complications for mothers and requires prediction models that can handle complex and variable patient data. This study uses graph-based learning to investigate how genetic, biochemical, and demographic factors interact across different contexts. By representing patients as nodes and their shared characteristics as edges, the framework illustrates how these factors influence the likelihood of disease. Graph neural networks are used for structured clinical notes, while BioBERT embeddings are used to handle unstructured clinical notes. This alignment keeps healthcare processes in their proper context rather than taking them out of context. The graph architecture combined with BioBERT incorporates semantic patterns derived from medical information into a relational structure that captures how similar patients are to one another. Evaluated on a substantial clinical dataset, the proposed method made more accurate and more interpretable predictions than the baseline models. The results indicate that combining a graph architecture with both structured and unstructured data can help uncover new approaches to the treatment of GDM that go beyond predictive performance alone. The findings also suggest that machine learning methods need to be adapted for use in healthcare applications.
    Keywords:  Explainable AI; Federated learning; Gestational diabetes mellitus; Graph neural networks; Process; SDG 3; Temporal analysis
    DOI:  https://doi.org/10.1038/s41598-026-57000-8
  9. Sci Rep. 2026 Jun 10.
      Extracellular matrix (ECM) remodeling contributes to retinal vascular basement membrane thickening, an early structural hallmark of diabetic retinopathy (DR). This study aimed to identify key ECM-related genes (ECMGs) associated with DR. Transcriptomic data of DR and ECMGs from MatrixDB were integrated to identify differentially expressed ECMGs. Six machine learning (ML) models, including Extra Trees (ET), Logistic Regression, Adaptive Boosting, Random Forest, Extreme Gradient Boosting, and naive Bayes classifier, were employed to construct DR classification models, with SHapley Additive exPlanations (SHAP) used to interpret feature contributions. Functional enrichment analysis using GSEA and immune infiltration analysis using CIBERSORT were conducted to explore the potential mechanisms by which key ECMGs regulate DR. Regulatory networks were constructed using predicted miRNAs, lncRNAs, and transcription factors (TFs) via the ENCORI, miRWalk, and miRNet databases. Interactions among drugs, key ECMGs, and diabetes mellitus (DM)-related diseases were further explored using the DGIdb and CTD databases. Nine candidate ECMGs were identified by overlapping 356 DM-associated differentially expressed genes (DEGs), 1,626 DR-associated DEGs, and 1,023 ECMGs, including CILP2, FN1, DEFA3, COL17A1, CRISP3, TPSAB1, SFRP1, GPHA2, and ECM2. Among the six ML algorithms, the ET classifier exhibited the best overall performance, and five ECMGs (SFRP1, CILP2, FN1, TPSAB1, and ECM2) with non-zero SHAP values were retained as key genes. These genes showed distinct expression patterns across the healthy, DM, and DR groups, and were enriched in neural-related pathways, such as axon guidance, glycosphingolipid biosynthesis ganglio series, and neuroactive ligand receptor interaction. Immune profiling and correlation analysis revealed that FN1, TPSAB1, and CILP2 were correlated with memory/naive B cells, CD8+ T cells, activated memory CD4+ T cells, Tregs, monocytes, and neutrophils. Additionally, the ceRNA network contained five miRNAs, seven lncRNAs, and two ECMGs, and regulatory and pharmacologic analysis further linked key ECMGs to specific TFs, drugs, and diabetes-related diseases. This study identified SFRP1, CILP2, FN1, TPSAB1, and ECM2 as key ECMGs in DR, revealing their coordinated involvement in ECM remodeling, neural signaling, and immune modulation. These findings provide novel insights into DR pathogenesis and potential therapeutic targets.
    Keywords:  Diabetic retinopathy; Extracellular matrix; Gene expression profiling; Machine learning
    DOI:  https://doi.org/10.1038/s41598-026-57016-0
  10. Diagnostics (Basel). 2026 May 27. pii: 1654. [Epub ahead of print]16(11):
      Background/Objectives: Disorganization of retinal inner layers (DRIL) is an important supportive biomarker in optical coherence tomography (OCT) imaging for assessing the extent of diabetic macular edema (DME) and anticipating visual outcomes. However, manual DRIL identification is subject to interobserver bias and requires considerable time and effort from experts. This research presents a novel, computerized, and clinically guided approach for the classification of DRIL that leverages the central 1 mm foveal region, extracted using annotations provided by expert ophthalmologists, and investigates the effectiveness of a transformer and masked autoencoder (MAE)-based foundation model (RETFound) as the primary approach. Methods: We fine-tuned and validated the RETFound model using accurate foveal center coordinates provided by experienced ophthalmologists. Our approach emphasizes the diagnostically significant macular region, where DME biomarkers manifest most prominently. To ensure robust evaluation, the dataset was divided into 85% training and 15% held-out test sets. We performed 5-fold cross-validation exclusively on the training dataset with baseline, conservative, and moderate fine-tuning strategies, and the final model was evaluated on the independent, unseen test set. Convolutional neural network (CNN)-based transfer learning (TL) models (MobileNetV2, EfficientNetB0, InceptionV3, DenseNet121, and DenseNet169) were also assessed for comparative evaluation. Results: The RETFound model yielded the best outcomes under the conservative fine-tuning strategy, achieving a mean test accuracy (AC) of 0.9339 ± 0.0036 and an area under the curve (AUC) of 0.9660 ± 0.0028 on the independent held-out test set across the five fold-trained models. The moderate and baseline strategies achieved comparatively lower outcomes, highlighting the effectiveness of the conservative approach. The RETFound model consistently outperformed CNN models, exhibiting stability and superior generalization for DRIL classification. We performed statistical validation using the Wilcoxon signed-rank test and 95% confidence intervals to confirm the robustness of the proposed method, and an ablation analysis showed that the fovea-centered region of interest (ROI) guidance consistently improved results when compared with whole OCT analysis. Conclusions: This research demonstrates that deep learning (DL) methods assisted by expert clinical knowledge with an anatomically aligned ROI could provide strong results in DRIL detection applications. This work attempts to establish an anatomically relevant framework for computerized DRIL identification that focuses on the crucial macular region, potentially enabling faster intervention and improved diagnosis in the management of DME.
    Keywords:  biomarker; deep learning; diabetes; diabetic macular edema; disease; disorganization of retinal inner layers; health; optical coherence tomography; transfer learning; vision transformer
    DOI:  https://doi.org/10.3390/diagnostics16111654
  11. Transl Vis Sci Technol. 2026 Jun 01. 15(6): 12
       Purpose: To evaluate and compare the clinical validity of synthetic ultra-widefield fluorescein angiography (UWFA) images generated from ultra-widefield fundus photography (UWFP) using generative adversarial networks (GANs) in patients with diabetic retinopathy (DR).
    Methods: Two GAN-based models, RegGAN and UWAFA-GAN, were trained to generate synthetic UWFA images from corresponding UWFP acquired using Optos California P200DTx. The dataset included 2084 image pairs (no DR: 124; nonproliferative DR (NPDR): 795; severe NPDR: 770; proliferative DR: 395). Technical image similarity was assessed using peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM). Clinical similarity was assessed by expert graders using the DR severity scale, comparing the synthetic images with corresponding real UWFA images.
    Results: Both models demonstrated acceptable performance in generating synthetic UWFA images from UWFP (F1-score range: 0.281-0.709; SSIM: 0.43-0.585; PSNR: 18.15-19.90). UWAFA-GAN achieved higher quantitative similarity metrics (SSIM and PSNR), indicating superior overall image fidelity, whereas RegGAN showed stronger clinical correlation with ground-truth UWFA, achieving higher diagnostic accuracy (66.7% vs. 49.2%) and more balanced precision, recall, and F1-scores across DR severity levels. This suggests that while UWAFA-GAN excels in pixel-level resemblance, RegGAN better preserves diagnostically relevant features.
    Conclusions: GAN models successfully generated UWFA images from UWFP in patients with DR. With further improvements, such models may serve as a complementary noninvasive imaging tool in selected clinical settings and may extend accessibility in resource-limited settings.
    Translational Relevance: GAN-based synthetic angiography may serve as a noninvasive alternative to UWFA, particularly when conventional angiography is contraindicated or unavailable.
    DOI:  https://doi.org/10.1167/tvst.15.6.12
  12. Comput Biol Chem. 2026 Jun 02. pii: S1476-9271(26)00271-9. [Epub ahead of print]124(Pt 2): 109145
      Type 2 diabetes (T2D) is a well-established metabolic risk factor for pancreatic cancer; however, the transcriptomic mechanisms linking these conditions and their utility for patient-level risk stratification remain incompletely understood. This paper introduces a novel risk estimation platform based on interpretable, probabilistic Tsetlin machines, which performs cross-disease risk estimation for pancreatic adenocarcinoma from transcriptomic crosstalk data of diabetic patients. The Tsetlin machine provides interpretability by highlighting crucial genes through clause support counts. This could enable increased screening for early detection of cancers and support personalised therapeutics for patients. To the best of our knowledge, this is the first interpretable AI-based platform that performs cross-disease risk estimation based on transcriptomic crosstalk between diabetes and pancreatic adenocarcinoma.
    Keywords:  Cross-disease risk estimation; Gene expression; Interpretable AI model; Pancreatic cancer; Probabilistic Tsetlin Machines; Transcriptomic crosstalk; Type 2 diabetes
    DOI:  https://doi.org/10.1016/j.compbiolchem.2026.109145
  13. Medicine (Baltimore). 2026 Jun 05. 105(23): e49208
      The increasing incidence of diabetic foot ulcer (DFU) and growing recognition of environmental pollutants have highlighted polyethylene terephthalate microplastics (PET-MP) as a potential metabolic disease trigger. However, the molecular mechanisms linking PET-MP to DFU remain unclear. This study employed integrated network toxicology and bioinformatics to decipher these mechanisms. PET-MP toxicity targets were screened using SwissTargetPrediction and ChEMBL, and DFU-related differentially expressed genes were obtained from GSE199939 and GSE134431. Functional analysis of overlapping genes included gene ontology, Kyoto encyclopedia of genes and genomes, gene set variation analysis, and protein-protein interaction network analysis. Machine learning models (least absolute shrinkage and selection operator, random forest, and support vector machine-recursive feature elimination) and SHapley Additive exPlanations analysis identified key genes, validated via nomogram, molecular dynamics simulation, and molecular docking. From 6723 DFU-related differentially expressed genes, 53 overlapping genes were identified. Functional analysis highlighted pathways including apoptosis, advanced glycation end product-receptor for advanced glycation end-product signaling, arachidonic acid metabolism, and nicotinamide adenine dinucleotide poly-ADP-ribosyltransferase activity. Machine learning and SHapley Additive exPlanations analysis identified PARP10 and PFKFB4 as key genes. Molecular docking revealed moderate binding affinities (Vina scores: -6.8 and -5.6). Molecular dynamics simulations confirmed conformational stability. PET-MP may exacerbate DFU by disrupting DNA damage repair, enhancing oxidative stress, and impairing glucose metabolism. These in silico findings identify PARP10 and PFKFB4 as potential candidate genes associated with PET-MP-related pathways in DFU, warranting further experimental validation.
    Keywords:  diabetic foot ulcer; machine learning; molecular docking; polyethylene terephthalate microplastics; toxicology
    DOI:  https://doi.org/10.1097/MD.0000000000049208
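
Note on entry 2: the F1-score is the harmonic mean of precision and recall, so the reported F1 can be recomputed from the other two reported values as a consistency check. The sketch below is illustrative only. The function name f1_from_precision_recall is an assumption introduced here and does not come from the paper.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1-score, the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Values reported for the full random forest model in entry 2.
reported_precision = 0.794
reported_recall = 0.907

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(round(f1, 3))  # 0.847, consistent with the reported F1
```

The recomputed value agrees with the abstract to three decimal places, which suggests the three reported metrics refer to the same decision threshold.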
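
Note on entry 3: soft voting combines several classifiers by averaging their per-class probabilities and predicting the class with the highest mean. The following is a minimal sketch of that general technique, not the authors' implementation. The function name soft_vote, the three hypothetical models, and all probability values are made up for illustration, and three classes are used for brevity.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average per-class probabilities across models, then take the argmax.

    prob_sets[m][i][c] is model m's probability that sample i is class c.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    predictions = []
    for i in range(n_samples):
        # Mean probability of each class over all models for sample i.
        mean_probs = [
            sum(prob_sets[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=mean_probs.__getitem__))
    return predictions


# Made-up probabilities: two samples, three classes, three hypothetical models.
model_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
model_b = [[0.4, 0.5, 0.1], [0.1, 0.3, 0.6]]
model_c = [[0.5, 0.4, 0.1], [0.2, 0.3, 0.5]]

print(soft_vote([model_a, model_b, model_c]))  # [0, 2]
```

Unlike hard (majority) voting, soft voting lets a confident model outweigh less certain ones, which is one plausible reason ensembles of this kind give more stable decisions.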
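
Note on entry 11: peak signal-to-noise ratio (PSNR) is computed from the mean squared error between a reference image and a generated image, as 10 * log10(MAX^2 / MSE) in decibels. The sketch below shows the standard formula on flattened pixel lists. It assumes 8-bit pixel values (maximum 255), and the function name psnr and the pixel values are illustrative, not taken from the paper.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], generated: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(generated) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


# Made-up 4-pixel example in which every pixel is off by 10 (MSE = 100).
real_pixels = [0, 64, 128, 255]
synthetic_pixels = [10, 54, 138, 245]
print(f"{psnr(real_pixels, synthetic_pixels):.2f} dB")  # about 28.13 dB
```

Because PSNR depends only on pixel-wise error, a higher score does not guarantee that diagnostically relevant structures are preserved, which fits the abstract's observation that the model with better PSNR and SSIM was not the one with higher diagnostic accuracy.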