Daniela Kalafatovic
University of Rijeka
The discovery of new active peptides (e.g., antimicrobial, antiviral, catalytic) is challenging, as they occupy a very large search space and the correlation between peptide sequence and the desired activities and/or functions is not yet fully understood. To avoid expensive and time-consuming guesswork and experimental failure, our strategy is to apply soft computing techniques to accelerate peptide discovery. Soft computing comprises probabilistic algorithms that are robust to imprecision and tolerant of uncertainty; they enable us to grapple with analytically intractable problems and compensate for the lack of theoretical knowledge. In our project we apply a wide range of soft computing models to predict peptide activity, construct novel peptides and cover the chemical search space (Kalafatovic et al., J. Cheminform, 2019). In more detail, we tackle the problems of (1) sensitivity of highly accurate predictive models, (2) building predictive models from a small amount of available data, (3) interpretability of neural network-based classifiers, (4) generation of new peptide sequences, and (5) coverage-based parallel exploration of chemical space.
Focusing on therapeutic peptides, we addressed the issue of sensitivity of highly accurate predictive models (Erjavac et al., AI Life Sciences, 2022) and proposed the sequential properties representation scheme to improve their predictive power (Otovic et al., J. Chem. Inf. Model, 2022). This provided the foundation for employing deep learning models, improved by transfer learning, to predict underrepresented categories or poorly researched peptide functions, such as a predisposition towards self-assembly or catalysis. To gain insight into the decision process of black-box neural network models, we employ the Grad-CAM technique, which enables us to pinpoint the properties and residues that are important for the prediction results and to analyze the models' behavior.
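As a rough illustration of what a sequential properties representation could look like, the sketch below encodes a peptide residue by residue as a vector of physicochemical properties while preserving sequence order. The choice of properties (Kyte-Doolittle hydropathy and a simplified charge at neutral pH), the zero padding, and the function name `encode_sequence` are assumptions made for this example only; the scheme published by Otovic et al. may use different properties and preprocessing.

```python
# Kyte-Doolittle hydropathy index per amino acid (one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified side-chain charge at neutral pH (assumption: histidine treated as neutral).
CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}


def encode_sequence(sequence, max_len):
    """Encode a peptide as a max_len x 2 matrix of [hydropathy, charge] rows.

    Row i describes residue i, so positional information is kept.
    Sequences shorter than max_len are padded with zero rows.
    """
    sequence = sequence.upper()
    if len(sequence) > max_len:
        raise ValueError("sequence is longer than max_len")

    rows = []
    for residue in sequence:
        if residue not in KYTE_DOOLITTLE:
            raise ValueError(f"unknown residue: {residue!r}")
        rows.append([KYTE_DOOLITTLE[residue], CHARGE.get(residue, 0.0)])

    # Pad to a fixed length so that all peptides share one input shape.
    rows.extend([0.0, 0.0] for _ in range(max_len - len(sequence)))
    return rows


# Hypothetical four-residue peptide, padded to length 6.
encoded = encode_sequence("KLAD", max_len=6)
for row in encoded:
    print(row)
```

A fixed-shape matrix of this kind can be fed to a sequence model such as a convolutional or recurrent network, and an attribution method such as Grad-CAM can then assign importance to individual positions and property channels.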
We envision that these strategies will maximize the chance of successful identification of functional peptides, partly reducing the environmental impact of failed experimental attempts.