Comput Biol Chem. 2024 Jan 22. pii: S1476-9271(24)00009-4. [Epub ahead of print] 109: 108021
Functional peptides are easily absorbed and have few side effects, which has attracted growing interest from pharmaceutical scientists. However, because laboratory funding and personnel are limited, it is difficult to screen functional peptides from the large number of peptides with unknown functions. With the development of machine learning and deep learning, combining computational methods with biological information offers an effective approach to identifying peptide functions. To explore the value of multi-functional active peptides, a new deep learning method named Deep2Pep (Deep learning to Peptides) was constructed, based on sequence encoding, embedding, and a language tokenizer. It predicts antimicrobial, antihypertensive, antioxidant, and antihyperglycemic activity by converting sequence information into numerical vectors and combining BiLSTM, an attention-residual algorithm, and a BERT encoder. Deep2Pep achieved a Hamming loss of 0.095, a subset accuracy of 0.737, and a macro F1-score of 0.734, outperforming the other models. BiLSTM played the primary role in Deep2Pep, while the BERT encoder played an auxiliary role. Deep learning algorithms were used in this study to accurately predict the four activities of peptides, and the approach is expected to provide a useful reference for predicting multi-functional peptides.
Keywords: Attention; Bidirectional Encoder Representation from Transformers (BERT); Bioactive Peptide; Long short-term memory (LSTM); Multi-label
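
The three multi-label metrics quoted in the abstract (Hamming loss, subset accuracy, macro F1-score) can be illustrated with a minimal sketch. It assumes the standard textbook definitions, which may differ in detail from those used in the paper; the four-peptide label matrix is invented toy data, not results from the study, and the function names are hypothetical.

```python
from typing import List

# Label order follows the four activities named in the abstract.
LABELS = ["antimicrobial", "antihypertensive", "antioxidant", "antihyperglycemic"]

Matrix = List[List[int]]  # one row per peptide, one 0/1 column per label


def hamming_loss(y_true: Matrix, y_pred: Matrix) -> float:
    """Fraction of individual label assignments that are wrong."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def subset_accuracy(y_true: Matrix, y_pred: Matrix) -> float:
    """Fraction of peptides whose full label set is predicted exactly."""
    exact = sum(list(t) == list(p) for t, p in zip(y_true, y_pred))
    return exact / len(y_true)


def macro_f1(y_true: Matrix, y_pred: Matrix) -> float:
    """Unweighted mean of the per-label F1 scores (assumed definition)."""
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        tp = sum(t[j] == 1 and p[j] == 1 for t, p in zip(y_true, y_pred))
        fp = sum(t[j] == 0 and p[j] == 1 for t, p in zip(y_true, y_pred))
        fn = sum(t[j] == 1 and p[j] == 0 for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A label with no positives and no predictions scores 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels


# Toy data (hypothetical): 4 peptides x 4 activity labels.
y_true = [[1, 0, 0, 0], [0, 1, 1, 0], [0, 0, 1, 0], [1, 0, 0, 1]]
y_pred = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [1, 0, 1, 1]]

print("Hamming loss:   ", hamming_loss(y_true, y_pred))
print("Subset accuracy:", subset_accuracy(y_true, y_pred))
print("Macro F1-score: ", macro_f1(y_true, y_pred))
```

Hamming loss counts each wrong label separately, so lower is better, whereas subset accuracy credits a peptide only when all four labels match. This is why a model can show a low Hamming loss alongside a noticeably lower subset accuracy.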