In Silico Biology 4, 0021 (2004); ©2004, Bioinformation Systems e.V.  



combAlign: A protein sequence comparison algorithm considering recombinations

Katja Wegner1, Stephan Jansen2, Stefan Wuchty3, Ralph Gauges1 and Ursula Kummer1,*

1 EML Research, Schloss-Wolfsbrunnenweg 33, D-69118 Heidelberg, Germany
  Email: wegner@eml-r.villa-bosch.de, gauges@eml-r.villa-bosch.de, kummer@eml-r.villa-bosch.de

2 F. Hoffmann-La Roche Ltd, Grenzacherstr. 124, CH - 4070 Basel, Switzerland
  Email: stephan.jansen@roche.com

3 Department of Physics, University of Notre Dame, 225 Nieuwland Science Hall, Notre Dame, IN 46556, USA
  Email: swuchty@nd.edu

*  corresponding author


Edited by H. Michael; received November 24, 2003; revised January 31, 2004; accepted March 01, 2004; published March 19, 2004


Abstract

Contemporary sequence alignment algorithms treat sequences strictly linearly, which limits their ability to detect recombinations that do not conserve segment order. Here, we introduce the algorithm combAlign, which assesses pairwise sequence similarity in the presence of non-order-conserving recombinations on a large scale. Following a two-level approach, combAlign first detects locally well-conserved subsequences shared by a target and a source sequence. The relative placement of these local alignments is then mapped to a graph. Local alignments are concatenated to reassemble the target sequence to the fullest extent, and the maximum-scoring path through the graph denotes the best attainable combAlignment. Parameters influencing this process can be set to meet the user's specific demands. We apply combAlign to examples demonstrating that it can reflect the evolutionary kinship of proteins even if their domains and motifs are strongly rearranged.

Availability: The source code is available upon request. The binaries are available for download.

Key words: point mutations, shuffling events, dynamic programming, graph theory, DAG
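
The second level described in the abstract (mapping local alignments to a graph and taking its maximum-scoring path) can be illustrated with a minimal sketch. This is not the combAlign implementation: the record type `LocalAlignment`, the function `best_chain`, and the example scores are hypothetical names and values chosen for illustration. The sketch assumes that the local alignments have already been computed, that two alignments may be concatenated whenever their target intervals do not overlap, and that their order in the source sequence is unconstrained. Overlaps in the source and gap penalties between alignments are ignored.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LocalAlignment:
    """Hypothetical record for one local alignment (half-open intervals)."""
    t_start: int   # start position in the target sequence
    t_end: int     # end position in the target sequence (exclusive)
    s_start: int   # start position in the source sequence
    s_end: int     # end position in the source sequence (exclusive)
    score: float   # local alignment score


def best_chain(alignments: List[LocalAlignment]) -> Tuple[float, List[LocalAlignment]]:
    """Return the maximum-scoring chain of alignments and its total score.

    Nodes are alignments; an edge i -> j exists when alignment i ends in the
    target before alignment j starts. Edges only point forward along the
    target, so the graph is a DAG. Source order is deliberately not checked,
    which allows non-order-conserving rearrangements (an assumption of this
    sketch).
    """
    if not alignments:
        return 0.0, []

    # Sorting by target start yields a topological order of the DAG.
    nodes = sorted(alignments, key=lambda a: (a.t_start, a.t_end))
    best = [a.score for a in nodes]   # best path score ending at each node
    prev = [-1] * len(nodes)          # predecessor on that best path

    for j in range(len(nodes)):
        for i in range(j):
            if nodes[i].t_end <= nodes[j].t_start:
                candidate = best[i] + nodes[j].score
                if candidate > best[j]:
                    best[j] = candidate
                    prev[j] = i

    # Trace back from the highest-scoring end node.
    k = max(range(len(nodes)), key=best.__getitem__)
    total = best[k]
    chain = []
    while k != -1:
        chain.append(nodes[k])
        k = prev[k]
    chain.reverse()
    return total, chain


# Two target segments appear in swapped order in the source; a third
# alignment overlaps both in the target and cannot be combined with them.
example = [
    LocalAlignment(0, 50, 60, 110, 40.0),
    LocalAlignment(50, 100, 0, 50, 35.0),
    LocalAlignment(30, 70, 200, 240, 50.0),
]
total, chain = best_chain(example)
print(total, [(a.t_start, a.t_end) for a in chain])
```

In this example the chain of the two swapped segments (total score 75.0) is preferred over the single higher-scoring alignment (50.0), although the chained segments occur in reverse order in the source. A purely linear alignment could not combine them in this way.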