Using the DeBERTa model to screen for antifungal peptides is feasible: BERT models have already been experimentally validated for antimicrobial peptide screening, and deep learning has also been shown to work for antifungal peptides.
DeBERTa's disentangled attention mechanism and contextual embeddings may allow models to be trained on fewer samples, making antifungal peptide screening more efficient and accurate and reducing the time and cost of experimental validation. On this basis, we trained an antifungal peptide screening model using DeBERTa.
Machine learning methods have been widely applied in antimicrobial peptide (AMP) research. Traditional machine learning models, such as Support Vector Machines (SVM), Random Forests (RF), and Neural Networks (NN), classify AMPs using physicochemical properties and amino acid compositions extracted from peptide sequences. However, because they depend on hand-crafted features, these methods often overlook important sequence information, such as the order of residues.
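To make the hand-crafted features concrete, the following is a minimal sketch of the kind of descriptor vector such classifiers consume: the amino acid composition plus a crude net charge. The function names, the toy sequence, and the simplified charge rule are illustrative assumptions, not the feature set of any specific published model.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return the fraction of each standard residue in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: counts.get(aa, 0) / len(seq) for aa in AMINO_ACIDS}


def approx_net_charge(seq):
    """Crude net charge: +1 per K/R, -1 per D/E (assumption: His and termini ignored)."""
    seq = seq.upper()
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


def feature_vector(seq):
    """20 composition values followed by the approximate net charge."""
    comp = amino_acid_composition(seq)
    return [comp[aa] for aa in AMINO_ACIDS] + [float(approx_net_charge(seq))]


# Hypothetical toy peptide, not taken from the dataset.
features = feature_vector("KKLLKR")
print(len(features))
```

Note that the composition part of this vector is identical for any permutation of the same residues, which illustrates how such features can lose sequence-order information.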
In contrast, deep learning can automatically extract key features from raw sequences, avoiding the reliance on manual feature engineering inherent in traditional methods. For example, the deep learning-based AMPs-Net model extracts effective features directly from sequences, significantly reducing the complexity of feature learning. However, machine learning has been applied far less to antifungal peptide (AFP) screening than to AMP screening.
BERT-based language models have already achieved significant results in AMP screening. BERT captures long-range dependencies within peptide sequences, making AMP sequence classification more accurate. These models can automatically learn contextual information from sequences, helping improve both prediction accuracy and generalization ability.
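As an illustration of how a peptide might be presented to a BERT-style model, the sketch below encodes a sequence one residue per token, wraps it in `[CLS]`/`[SEP]` markers, and pads it to a fixed length with an attention mask. The vocabulary, the special-token IDs, and the `encode_peptide` helper are hypothetical; the tokenizer actually used with the model described here is not specified in the text.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SPECIAL_TOKENS = ["[PAD]", "[CLS]", "[SEP]", "[UNK]"]

# Hypothetical vocabulary: special tokens first, then one token per residue.
VOCAB = {tok: i for i, tok in enumerate(SPECIAL_TOKENS + list(AMINO_ACIDS))}


def encode_peptide(seq, max_len=16):
    """Return (input_ids, attention_mask), both of length max_len."""
    residues = list(seq.upper())[: max_len - 2]  # leave room for [CLS] and [SEP]
    tokens = ["[CLS]"] + residues + ["[SEP]"]
    ids = [VOCAB.get(tok, VOCAB["[UNK]"]) for tok in tokens]
    mask = [1] * len(ids)  # 1 marks real tokens, 0 marks padding
    n_pad = max_len - len(ids)
    return ids + [VOCAB["[PAD]"]] * n_pad, mask + [0] * n_pad


input_ids, attention_mask = encode_peptide("KLX", max_len=8)
print(input_ids)       # [1, 12, 13, 3, 2, 0, 0, 0]
print(attention_mask)  # [1, 1, 1, 1, 1, 0, 0, 0]
```

Because every residue keeps its own position in the input, the attention layers can relate distant residues to one another, which is what allows long-range dependencies to be modeled.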
Our model was trained on more than 6,000 antifungal peptide entries and more than 30,000 negative entries. On the test set, the trained model achieved a positive prediction accuracy of 82.7%, a negative prediction accuracy of 99.8%, and an overall accuracy of 99.4%.
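For readers who want to reproduce metrics of this kind, here is a minimal sketch of how the three figures relate to a confusion matrix. It assumes that "positive prediction accuracy" means the fraction of true antifungal peptides correctly identified (sensitivity) and that "negative prediction accuracy" means the fraction of true negatives correctly identified (specificity); the counts used are hypothetical and are not the actual test-set counts.

```python
def screening_metrics(tp, fn, tn, fp):
    """Return (positive accuracy, negative accuracy, overall accuracy)."""
    positive_acc = tp / (tp + fn)   # share of true positives recovered
    negative_acc = tn / (tn + fp)   # share of true negatives recovered
    overall_acc = (tp + tn) / (tp + fn + tn + fp)
    return positive_acc, negative_acc, overall_acc


# Hypothetical confusion-matrix counts for illustration only.
pos_acc, neg_acc, overall = screening_metrics(tp=80, fn=20, tn=990, fp=10)

# Overall accuracy is the class-weighted average of the two per-class values.
pos_share = (80 + 20) / (80 + 20 + 990 + 10)
weighted = pos_share * pos_acc + (1 - pos_share) * neg_acc
print(pos_acc, neg_acc, overall, weighted)
```

Since overall accuracy is weighted by class frequency, it lies close to the negative-class figure whenever negatives dominate the test set, so the per-class values are the more informative ones for an imbalanced screening task.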
The model was then used to predict candidate antimicrobial peptide sequences together with their minimum inhibitory concentration (MIC) values. The sequences and predicted MIC values are presented in the table below:
Wet-lab experiments were conducted for validation. Using MIC ≤ 64 ng/µL (equivalent to 64 µg/mL) as the criterion for antimicrobial activity, the positive rate against Candida albicans exceeded 30%.
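The activity criterion can be written as a small calculation. The sketch below applies the 64 ng/µL threshold from the text to a list of measured MIC values and reports the fraction that count as active; the MIC values, and the use of `None` for peptides with no measurable inhibition, are hypothetical illustrations rather than the experimental data.

```python
MIC_THRESHOLD = 64.0  # ng/µL, the activity criterion stated in the text


def positive_rate(mic_values, threshold=MIC_THRESHOLD):
    """Fraction of tested peptides whose measured MIC is at or below the threshold.

    A value of None stands for a peptide with no measurable inhibition
    (an assumption of this sketch) and is counted as inactive.
    """
    if not mic_values:
        raise ValueError("at least one tested peptide is required")
    hits = sum(1 for mic in mic_values if mic is not None and mic <= threshold)
    return hits / len(mic_values)


# Hypothetical measured MICs (ng/µL) for ten tested peptides.
measured = [16, 64, 128, 256, None, 32, 512, 128, 256, 512]
print(positive_rate(measured))  # 0.3
```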