BMC Biol. 2024 Apr 19. 22(1): 86
BACKGROUND: The blood-brain barrier serves as a critical interface between the bloodstream and brain tissue, mainly composed of pericytes, neurons, endothelial cells, and tightly connected basal membranes. It plays a pivotal role in safeguarding brain from harmful substances, thus protecting the integrity of the nervous system and preserving overall brain homeostasis. However, this remarkable selective transmission also poses a formidable challenge in the realm of central nervous system diseases treatment, hindering the delivery of large-molecule drugs into the brain. In response to this challenge, many researchers have devoted themselves to developing drug delivery systems capable of breaching the blood-brain barrier. Among these, blood-brain barrier penetrating peptides have emerged as promising candidates. These peptides had the advantages of high biosafety, ease of synthesis, and exceptional penetration efficiency, making them an effective drug delivery solution. While previous studies have developed a few prediction models for blood-brain barrier penetrating peptides, their performance has often been hampered by issue of limited positive data.RESULTS: In this study, we present Augur, a novel prediction model using borderline-SMOTE-based data augmentation and machine learning. we extract highly interpretable physicochemical properties of blood-brain barrier penetrating peptides while solving the issues of small sample size and imbalance of positive and negative samples. Experimental results demonstrate the superior prediction performance of Augur with an AUC value of 0.932 on the training set and 0.931 on the independent test set.
CONCLUSIONS: This newly developed Augur model demonstrates superior performance in predicting blood-brain barrier penetrating peptides, offering valuable insights for drug development targeting neurological disorders. This breakthrough may enhance the efficiency of peptide-based drug discovery and pave the way for innovative treatment strategies for central nervous system diseases.
Keywords: Blood–brain barrier; Data augmentation; Feature selection; Information gain; Machine learning; Penetrating peptides