Feature Propagation - Semi-automatic transfer of position-dependent SWISS-PROT annotation

Arno Velds, Henning Hermjakob and Rolf Apweiler




EMBL Outstation Hinxton
The European Bioinformatics Institute (EBI)
Wellcome Trust Genome Campus
Hinxton, Cambridgeshire CB10 1SD, UK
Phone: +44 (0)1223 494427
Fax: +44 (0)1223 494 468
E-mail: avelds@ebi.ac.uk






DESCRIPTION

SWISS-PROT [Bairoch and Apweiler, 1999] is a manually curated protein sequence database that provides high-quality annotation. TrEMBL [Bairoch and Apweiler, 1999] is a manually checked translation of the EMBL database [Stoesser et al., 1999] that complements the SWISS-PROT database. Data is added to TrEMBL faster than it can be manually annotated. Currently, position-independent annotation is added to TrEMBL automatically by a process called distributed annotation [Moeller et al., 1999].

Position-specific SWISS-PROT annotation can be copied between entries on the basis of sequence similarity. The featurePropagation package allows controlled propagation of these position-specific features to other entries. It uses an alignment of a master entry and a set of target entries, made with either CLUSTALW [Thompson et al., 1994] or FASTA [Pearson, 1990]. The alignment is used both to position the master features on the target sequence and to calculate the score of each individual feature.
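The positioning step can be illustrated with a minimal sketch. It is not the featurePropagation code: the class AlignmentMapper and its methods are hypothetical names, the alignment is assumed to be given as two gapped strings of equal length with '-' as the gap character, and the per-feature score is assumed to be simple residue identity over the feature range, since the abstract does not say how the score is calculated.

```java
/** Hypothetical helper, not part of the featurePropagation package. */
final class AlignmentMapper {
    private static final char GAP = '-';

    /**
     * Maps a 1-based master position to the 1-based target position in the
     * same alignment column. Returns -1 if the master residue is aligned to
     * a gap or the position lies outside the master sequence.
     */
    static int mapPosition(String alignedMaster, String alignedTarget, int masterPos) {
        checkSameLength(alignedMaster, alignedTarget);
        int m = 0; // residues of the master seen so far
        int t = 0; // residues of the target seen so far
        for (int i = 0; i < alignedMaster.length(); i++) {
            boolean masterResidue = alignedMaster.charAt(i) != GAP;
            boolean targetResidue = alignedTarget.charAt(i) != GAP;
            if (masterResidue) m++;
            if (targetResidue) t++;
            if (masterResidue && m == masterPos) {
                return targetResidue ? t : -1;
            }
        }
        return -1;
    }

    /**
     * Assumed local score: fraction of master residues in the 1-based range
     * [start, end] that are aligned to an identical target residue.
     */
    static double featureIdentity(String alignedMaster, String alignedTarget, int start, int end) {
        checkSameLength(alignedMaster, alignedTarget);
        if (start < 1 || end < start) {
            throw new IllegalArgumentException("invalid feature range");
        }
        int m = 0;
        int identical = 0;
        for (int i = 0; i < alignedMaster.length(); i++) {
            char a = alignedMaster.charAt(i);
            if (a == GAP) continue; // column holds no master residue
            m++;
            if (m < start) continue;
            if (m > end) break;
            if (a == alignedTarget.charAt(i)) identical++;
        }
        return (double) identical / (end - start + 1);
    }

    private static void checkSameLength(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("aligned sequences differ in length");
        }
    }
}
```

With the made-up alignment MKT-AYIAK (master) over MRTGAY-AK (target), master position 4 maps to target position 5, and master position 6 has no target counterpart because it faces a gap.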

The copying process is controlled by a set of rules. Each rule describes the features it applies to and can set a threshold for feature similarity (local), set a threshold for protein similarity (global), or require a pattern to be present in the target protein sequence. The newly created feature lines are tagged with master entry details for future reference.
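A rule of this kind might be modelled as follows. This is only a sketch under stated assumptions: the class PropagationRule, its fields and the allows method are hypothetical, scores are assumed to be fractions between 0 and 1, and the sequence pattern is assumed to be a Java regular expression. The actual rule format of the package is not described here.

```java
import java.util.regex.Pattern;

/** Hypothetical rule model, not the package's actual rule format. */
final class PropagationRule {
    private final String featureKey;       // feature key the rule applies to
    private final double minFeatureScore;  // local threshold (assumed 0..1)
    private final double minProteinScore;  // global threshold (assumed 0..1)
    private final Pattern requiredPattern; // optional; null means no check

    PropagationRule(String featureKey, double minFeatureScore,
                    double minProteinScore, String requiredRegex) {
        this.featureKey = featureKey;
        this.minFeatureScore = minFeatureScore;
        this.minProteinScore = minProteinScore;
        this.requiredPattern = requiredRegex == null ? null : Pattern.compile(requiredRegex);
    }

    /** Returns true only if the key matches and every configured check passes. */
    boolean allows(String key, double featureScore, double proteinScore, String targetSequence) {
        if (!featureKey.equals(key)) return false;
        if (featureScore < minFeatureScore) return false;
        if (proteinScore < minProteinScore) return false;
        return requiredPattern == null || requiredPattern.matcher(targetSequence).find();
    }
}
```

For example, a rule built as new PropagationRule("ACT_SITE", 1.0, 0.4, "G[DN]S") would let an active-site feature through only when the feature residues are fully conserved, the overall protein similarity is at least 0.4, and the target sequence contains the motif. The thresholds and motif are illustrative values, not ones used by the authors.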

The program is written in Java and runs on most platforms. It is used for annotating TrEMBL and for curator-controlled bulk annotation in SWISS-PROT.


AVAILABILITY

The featurePropagation package is available on request. Contact avelds@ebi.ac.uk.


ACKNOWLEDGEMENTS

The package is largely built upon the SWISS package by S. Möller (moeller@ebi.ac.uk). The SWISS package is a set of Java classes that provide reading, modifying and writing functionality for SWISS-PROT formatted entries.


REFERENCES