ProML - the Protein Markup Language for specification of protein sequences, structures and families
Daniel Hanisch1, Ralf Zimmer2 and Thomas Lengauer3
1Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, D-53754 Sankt Augustin, Germany
We propose ProML, a specification language for protein sequences, structures, and families based on the open XML standard. The language allows for a portable, system-independent, machine-parsable, and human-readable representation of essential features of proteins. It is of immediate use for several bioinformatics applications: we discuss the clustering of proteins into families and the representation of the specific shared features of the respective clusters. Moreover, we use ProML to specify data used in fold recognition benchmarks exploiting experimentally derived distance constraints.
Key words: Protein Markup Language, ProML, XML, protein properties, protein families, protein structures, distance constraints, protein clusters