SPI – Structure predictability index for protein sequences
Michal Brylinski1, Leszek Konieczny2 and Irena Roterman3, *
1 Department of Bioinformatics and Telemedicine, Collegium Medicum – Jagiellonian University, Kopernika 17, 31-501 Cracow, Poland
Estimating how predictable the structure of a particular protein is remains difficult. Many methods make this estimate a posteriori, by evaluating the final, native protein structure. The SPI scale is intended to estimate the structure predictability of a particular amino acid sequence a priori. A sequence-to-structure library was created from the complete Protein Data Bank. The tetrapeptide was selected as the unit representing a well-defined structural motif. The early-stage folding structure (a model of which was presented elsewhere) was taken as the object of protein structure classification, and seven structural forms were distinguished. The degree of determinability was estimated for the sequence-to-structure and structure-to-sequence relations, which are of particular interest for threading methods. A comparative analysis of the SPI and Q7 scales against the commonly used SOV and Q3 scales is presented. The complete contingency table, supplementary materials and all the programs used are available on request.
Keywords: protein structure prediction, predictability scale, early stage of folding
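As an illustration only, the sequence-to-structure contingency table and a per-residue agreement score can be sketched as follows. This is a minimal sketch and not the authors' programs. It assumes that each residue carries a one-letter structural code (the letters used here are hypothetical placeholders for the seven structural forms) and that tetrapeptides are taken as overlapping windows. The function names `tetrapeptide_table` and `q_score` are hypothetical. The score follows the usual Q3-style definition, the percentage of residues whose predicted state matches the observed one, applied to any number of states.

```python
from collections import Counter, defaultdict


def tetrapeptide_table(sequence, structure, k=4):
    """Count how often each sequence k-mer co-occurs with each structure k-mer.

    `sequence` is a string of one-letter amino acid codes; `structure` is a
    string of the same length holding one structural code per residue
    (hypothetical labels, assumed for this sketch).
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    table = defaultdict(Counter)
    # Slide an overlapping window of length k along both strings.
    for i in range(len(sequence) - k + 1):
        table[sequence[i:i + k]][structure[i:i + k]] += 1
    return table


def q_score(predicted, observed):
    """Percentage of positions where the predicted state equals the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    table = tetrapeptide_table("ACDEACDE", "AABBAABB")
    print(dict(table["ACDE"]))
    print(q_score("AABB", "AABC"))
```

In such a table, a row whose counts are concentrated in a single structural code corresponds to a tetrapeptide whose structure is strongly determined by its sequence, while a row with counts spread across many codes corresponds to low determinability.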