IGPRED-MultiTask: A Deep Learning Model to Predict Protein Secondary Structure, Torsion Angles and Solvent Accessibility

GÖRMEZ, YASİN; AYDIN, ZAFER

doi:10.1109/tcbb.2022.3191395

IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 20, no. 2, pp. 1104-1113, 2023 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 20 Issue: 2
Publication Date: 2023
DOI: 10.1109/tcbb.2022.3191395
Journal Name: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Aerospace Database, BIOSIS, Biotechnology Research Abstracts, Communication Abstracts, Compendex, EMBASE, INSPEC, MEDLINE, Metadex, Civil Engineering Abstracts
Pages: pp. 1104-1113
Keywords: Proteins, Predictive models, Deep learning, Solvents, Amino acids, Recurrent neural networks, Feature extraction, Feature extraction or construction, machine learning, protein structure prediction, bioinformatics, deep learning, REAL-VALUE PREDICTION, ACCURATE PREDICTION, NEURAL-NETWORKS
Kayseri University Affiliated: No

Abstract

Protein secondary structure, solvent accessibility, and torsion angle predictions are preliminary steps toward predicting the 3D structure of a protein. Deep learning approaches have achieved significant improvements in predicting various features of protein structure. In this study, IGPRED-Multitask, a deep learning model with a multi-task learning architecture based on a deep inception network, a graph convolutional network, and a bidirectional long short-term memory network, is proposed. Moreover, the hyper-parameters of the model are fine-tuned using Bayesian optimization, which is faster and more effective than grid search. For a fair comparison with the literature, the same benchmark test datasets as in the OPUS-TASS paper (TEST2016, TEST2018, CASP12, CASP13, CASPFM, HARD68, CAMEO93, and CAMEO93_HARD), as well as the same training and validation sets, are used. Compared to the state-of-the-art methods, statistically significant improvements are observed in secondary structure prediction on 4 datasets, in phi angle prediction on 2 datasets, and in psi angle prediction on 3 datasets. For solvent accessibility prediction, the TEST2016 and TEST2018 datasets are used only to assess the performance of the proposed model. The IGPRED-Multitask method is available on the PSP server, which can be accessed at http://psp.agu.edu.tr.