
In Silico Biology 6, 0013 (2006); ©2006, Bioinformation Systems e.V.  



Analysis and prediction of helix shift errors in homology modeling

Christoph Bock* and Jürgen Hesser1

Institute for Computational Medicine, Universities of Mannheim and Heidelberg, Germany
1 Email: jhesser@rumms.uni-mannheim.de

* Corresponding author
   Email: cbock@rumms.uni-mannheim.de


Edited by E. Wingender; received November 26, 2005; revised January 25/February 19, 2006; accepted February 19, 2006; published March 01, 2006


Abstract

High sequence identity between two proteins (e.g. >60%) is strong evidence of high structural similarity. However, internal shifts in one of the two proteins can sometimes give rise to unexpectedly large structural differences. This, in turn, leads to unreliable structure predictions when two such proteins are used in homology modeling. Here, we perform a computational analysis of helix shifts and show that their occurrence can be predicted with statistical learning methods.

Our results indicate that helix shifts increase the RMS error by a factor of 2.6 relative to protein pairs without a helix shift. Although helix shifts are rare (1.6% of helices are affected, and a commensurately higher fraction of proteins), they therefore pose a significant problem for reliable structure prediction systems. In this paper, we prototype a new approach for model quality assessment and demonstrate that it can successfully warn against helix shifts. A support vector machine trained on a wide range of sequence and structure properties predicts the occurrence of helix shifts with a sensitivity of 74.2% and a specificity of 83.6%. On an equalized test dataset, this corresponds to an accuracy of 78.9%. Projected to the full dataset, this translates to an accuracy of 83.4%.
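The relation between these figures can be checked with a short calculation. The sketch below treats accuracy as the prevalence-weighted mean of sensitivity and specificity. It assumes that the "equalized" dataset contains equal numbers of shifted and unshifted helices, and that the projection to the full dataset uses the 1.6% helix shift rate quoted above; the abstract does not state either assumption explicitly, and the function name `accuracy` is chosen here for illustration only.

```python
def accuracy(sensitivity, specificity, prevalence):
    """Accuracy of a binary classifier as a prevalence-weighted mean.

    prevalence is the fraction of positive cases (here: helices with a shift).
    """
    return prevalence * sensitivity + (1.0 - prevalence) * specificity


# Values reported in the abstract.
SENSITIVITY = 0.742
SPECIFICITY = 0.836

# Assumption: the equalized test set has a 50/50 class balance.
balanced = accuracy(SENSITIVITY, SPECIFICITY, prevalence=0.5)

# Assumption: the full dataset has the 1.6% helix shift rate quoted above.
projected = accuracy(SENSITIVITY, SPECIFICITY, prevalence=0.016)

print(f"equalized: {balanced:.1%}")   # 78.9%
print(f"projected: {projected:.1%}")  # 83.4%
```

Under these assumptions both reported accuracies are reproduced. Because shifts are rare, the projected accuracy is dominated by the specificity.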

Our analysis shows that helix shift detection is a valuable building block for highly reliable structure prediction systems. Furthermore, the statistical-learning-based approach to helix shift detection that we employ here is orthogonal to well-established model quality assessment methods (which use geometric constraint checking or mean force potentials). Therefore, a further increase in prediction accuracy can be expected from combining these methods.


Keywords: comparative modeling, homology modeling, error sources, MQAP, helix movement, secondary structure