Frameshift signals in genes associated with the circular code
Ahmed Ahmed, Gabriel Frey and Christian J. Michel*
Equipe de Bioinformatique Théorique, LSIIT (UMR CNRS-ULP 7005), Université Louis Pasteur de Strasbourg, Pôle API, Boulevard Sébastien Brant, 67400 Illkirch, France
Three sets of 20 trinucleotides are preferentially associated with the reading frame and its 2 shifted frames in both eukaryotic and prokaryotic genes. These 3 sets are circular codes. In genes containing these circular code words, they allow any frame to be retrieved locally, anywhere in the 3 frames and in particular without a start codon in the reading frame, and automatically, after reading only a few nucleotides. The circular code of the reading frame, denoted X, from which the 2 circular codes of the shifted frames can be deduced by permutation, is the information used here to analyse frameshift genes, i.e. genes with a change of reading frame during translation. This work studies the circular code signal around the frameshift sites of these genes. Two scoring methods are developed: a function P based on the code X, and a function Q based on both the code X and the 4 trinucleotides with identical nucleotides. Both methods detect a significant correlation between the code X and the −1 frameshift signals in both eukaryotic and prokaryotic genes, and between the code X and the +1 frameshift signals in eukaryotic genes.
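The permutation and frame retrieval properties described above can be illustrated with a minimal sketch. It rests on several assumptions: the abstract does not list the words of X, so the set below is the one commonly reported for this circular code and should be checked against the paper's own table; the helpers `permute`, `frame_coverage` and `best_frame` are hypothetical names introduced for illustration; and the simple coverage count is not the scoring function P or Q, whose definitions are not given here.

```python
# Trinucleotides commonly reported for the circular code X (assumption:
# the abstract does not list them; verify against the paper's own table).
X = frozenset("""AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC
GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC""".split())


def permute(word, k=1):
    """Circularly shift a word k positions to the left (AAC -> ACA)."""
    k %= len(word)
    return word[k:] + word[:k]


# The 2 other sets are obtained from X by circular permutation of each word.
X1 = frozenset(permute(w, 1) for w in X)
X2 = frozenset(permute(w, 2) for w in X)


def frame_coverage(seq, code=X):
    """Fraction of complete trinucleotides in `code` at offsets 0, 1 and 2.

    Illustrative count only; this is not the paper's function P or Q.
    """
    coverage = []
    for offset in range(3):
        # Only complete trinucleotides are kept (i + 3 <= len(seq)).
        words = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        hits = sum(w in code for w in words)
        coverage.append(hits / len(words) if words else 0.0)
    return coverage


def best_frame(seq, code=X):
    """Offset (0, 1 or 2) whose trinucleotides are most often in `code`."""
    coverage = frame_coverage(seq, code)
    return coverage.index(max(coverage))
```

For example, `best_frame` applied to a sequence built from words of X, such as `"GAACTGGTCTAC"`, and to the same sequence preceded by one extra nucleotide, shows how the offset with the highest coverage follows the shift, without relying on a start codon. Under the assumed word list, X and its two permuted sets are pairwise disjoint and together cover the 60 trinucleotides other than AAA, CCC, GGG and TTT.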