In Silico Biology 7, 0016 (2007); ©2007, Bioinformation Systems e.V.  

Frameshift signals in genes associated with the circular code

Ahmed Ahmed, Gabriel Frey and Christian J. Michel*

Equipe de Bioinformatique Théorique, LSIIT (UMR CNRS-ULP 7005), Université Louis Pasteur de Strasbourg, Pôle API, Boulevard Sébastien Brant, 67400 Illkirch, France

* Corresponding author

Edited by E. Wingender; received October 13, 2006; revised January 22, 2007; accepted February 18, 2007; published April 11, 2007


Three sets of 20 trinucleotides are preferentially associated with the reading frame and the 2 shifted frames of both eukaryotic and prokaryotic genes. These 3 sets are circular codes. In genes containing circular code words, they allow any frame to be retrieved locally, anywhere in the 3 frames and in particular without a start codon in the reading frame, and automatically, after reading only a few nucleotides. The circular code in the reading frame, noted X, from which the 2 other circular codes in the shifted frames can be deduced by permutation, is the information used here for analysing frameshift genes, i.e. genes with a change of reading frame during translation. This work studies the circular code signal around the frameshift sites of these genes. Two scoring methods are developed: a function P based on the code X, and a function Q based on both the code X and the 4 trinucleotides with identical nucleotides. They detect a significant correlation between the code X and the −1 frameshift signals in both eukaryotic and prokaryotic genes, and the +1 frameshift signals in eukaryotic genes.

Keywords: frameshift gene, frameshift signal, circular code, trinucleotide, frame, statistical method
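
The relation between the code X, its two permuted codes and the 4 trinucleotides with identical nucleotides, as stated in the abstract, can be illustrated with a minimal Python sketch. The abstract does not list the 20 trinucleotides of X; the set below is the one commonly reported in the circular code literature and is an assumption here. The names `rotate`, `X_ROT1`, `X_ROT2`, `PERIODIC` and `count_code_words` are hypothetical, and `count_code_words` is only a naive word count per frame, not the scoring functions P or Q of this work.

```python
from itertools import product

# Assumed content of the circular code X (20 trinucleotides); not quoted from
# this abstract, taken from the commonly reported set in the literature.
X = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)


def rotate(word, k=1):
    """Circular permutation: move the first k letters of word to its end."""
    k %= len(word)
    return word[k:] + word[:k]


# The two sets deduced from X by one and by two circular permutations.
X_ROT1 = frozenset(rotate(w, 1) for w in X)
X_ROT2 = frozenset(rotate(w, 2) for w in X)

# The 4 trinucleotides with identical nucleotides.
PERIODIC = frozenset(base * 3 for base in "ACGT")

# All 64 trinucleotides over the alphabet {A, C, G, T}.
ALL_TRINUCLEOTIDES = frozenset("".join(t) for t in product("ACGT", repeat=3))


def count_code_words(seq, frame, code=X):
    """Count trinucleotides of seq, read from offset frame (0, 1 or 2),
    that belong to code. Illustrative only; this is not the function P or Q."""
    return sum(seq[i:i + 3] in code for i in range(frame, len(seq) - 2, 3))


if __name__ == "__main__":
    seq = "AACGTTGAC"  # toy sequence built from three words of X
    print([count_code_words(seq, f) for f in range(3)])
```

Under this assumed set, the three codes and the 4 trinucleotides with identical nucleotides are pairwise disjoint and together cover all 64 trinucleotides, which is why a word count per frame can discriminate between the reading frame and its shifted frames on the toy sequence.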