In Silico Biology 7, 0048 (2007); ©2007, Bioinformation Systems e.V.  

Prediction of ubiquitin proteins using artificial neural networks, hidden Markov model and support vector machines

Kunal Jaiswal

Department of Bioinformatics & Biotechnology, Jaypee University of Information Technology, Waknaghat, Solan, Himachal Pradesh, India - 173 215

Phone: +91-9971241380

Edited by E. Wingender; received January 13, 2007; revised May 03, 2007; accepted August 16, 2007; published October 08, 2007


Ubiquitin regulates protein turnover in a cell by tightly controlling the degradation of specific proteins. Because this regulatory role is important, I analyzed ubiquitin-like proteins using an artificial neural network, support vector machines (SVM) and a hidden Markov model (HMM). The methods were trained and tested on a set of 373 ubiquitin proteins and 373 non-ubiquitin proteins obtained from the Entrez protein database. The artificial neural network and the support vector machine were trained and tested using both physicochemical properties and PSSM matrices generated by PSI-BLAST, while the HMM-based method was trained and tested directly on the sequences. The performance of each method on the test sequences was measured by accuracy, specificity, sensitivity and the Matthews correlation coefficient. The highest accuracy (90.2%), with a specificity of 87.04% and a sensitivity of 94.08%, was achieved by the support vector machine model with PSSM matrices. Accuracies of 86.82%, 83.37%, 80.18% and 72.11% were obtained for the support vector machine with physicochemical properties, the neural network with PSSM matrices, the neural network with physicochemical properties, and the hidden Markov model, respectively. Because the SVM model was more accurate than the other methods with both physicochemical properties and PSSM matrices, it is concluded that kernel methods such as SVM outperform neural networks and hidden Markov models for this task.
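The four performance measures named above have standard definitions in terms of the confusion-matrix counts of a binary classifier. The following is a minimal sketch of those definitions, not the code used in this study; the function name binary_metrics and the example counts are hypothetical and are not taken from the reported results.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute standard binary-classifier measures from confusion-matrix counts.

    tp/fn: ubiquitin proteins predicted correctly/incorrectly (assumed positive class)
    tn/fp: non-ubiquitin proteins predicted correctly/incorrectly
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Illustrative counts only (hypothetical, not from the paper).
example = binary_metrics(tp=90, tn=80, fp=20, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

On a balanced test set such as the 373/373 split described here, accuracy is the mean of sensitivity and specificity, while the Matthews correlation coefficient also penalizes a classifier that favors one class.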

Keywords: ubiquitin proteins, support vector machine, artificial neural networks, hidden Markov model, physicochemical properties, PSI-BLAST, PSSM matrices, binary classifier