TRANSCompel® - a professional database on composite regulatory elements in eukaryotic genes.

Olga V. Kel-Margoulis1,2,*, Igor V. Deineko2, Ingmar Reuter1, Edgar Wingender1,3 and Alexander E. Kel1,2




1BIOBASE GmbH,
Halchtersche Strasse 33,
D-38304 Wolfenbuettel, Germany;
2Institute of Cytology & Genetics SB RAN,
10 Lavrentyev pr., 630090,
Novosibirsk, Russia;
3 Research Group Bioinformatics, Gesellschaft fuer Biotechnologische Forschung mbH,
Mascheroder Weg 1,
D-38124 Braunschweig, Germany.
*To whom correspondence should be addressed.
Present address: BIOBASE GmbH,
Halchtersche Strasse 33,
D-38304 Wolfenbuettel, Germany,
Phone: 05331-858426
Fax: 05331-858470
Email: oke@biobase.de






Originating from COMPEL [Kel-Margoulis et al., 2000a,b], the TRANSCompel® database emphasizes the key role of specific interactions between transcription factors bound to their target sites, which provide specific features of gene regulation in a particular cellular context.

Based on the known examples, we define a composite element as a minimal functional unit in which both protein-DNA and protein-protein interactions contribute to a highly specific pattern of gene transcriptional regulation [Kel O.V. et al., 1995; Kel O.V. et al., 1997; Kel-Margoulis et al., 2000a]. The interacting factors may differ in the structure of their DNA-binding, activation, oligomerization and other domains. Along with these structural differences, the functional properties of the transcription factors, and hence their specific contributions to transcription regulation, may vary significantly. Co-operative action of the transcription factors within a composite element results in a new, highly specific pattern of gene transcription that cannot be provided by either factor separately. Composite elements are structural-functional units that provide cross-coupling of gene regulatory pathways and, in particular, cross-coupling of signal transduction pathways [Kel-Margoulis et al., 2000a].

There are two main types of composite elements (CEs): synergistic and antagonistic. In synergistic CEs, simultaneous interactions of two factors with closely situated target sites result in a non-additive, high level of transcriptional activation. Within an antagonistic CE, the two factors interfere with each other.

During the last year, the content of the database has been enlarged by approximately 30% in comparison with COMPEL 3.0 [Kel-Margoulis et al., 2000a; Kel-Margoulis et al., 2000b]. Three versions of the database have been released: public version 4.4 and professional versions 5.1 and 5.2 (Table 1).


Table 1: Content of the current public and professional versions of TRANSCompel®.

                                             Number of entries
                                             4.4 (public)   5.2 (professional)
Composite elements                           202            256
Genes                                        131            162
Links to EMBL                                171            216
Transcription factors linked to TRANSFAC     171            216
Interactions                                 639            948
Evidences                                    602            846
References                                   207            281

Recent entries include CEs containing binding sites for the following transcription factors: Smads (14 entries), Steroidogenic Factor 1 (11 entries), SREBP (8 entries), AML/PEBP (10 entries), PU.1 (19 entries) and c-Ets-1,2 (39 entries).

TRANSCompel® is maintained internally as a relational database and distributed as a single ASCII flat file. Public version 4.4 is available at http://www.gene-regulation.com/pub/databases.html#transcompel; the current professional version 5.2 can be obtained from BIOBASE (http://www.biobase.de). COMPEL release 3.0 can be found at http://compel.bionet.nsc.ru/. A detailed description of the fields is given in the database documentation. WWW-based search and browse options are available.

Classification of the composite elements.   We have classified CEs according to the specific transcriptional regulation they provide through the co-operative action of transcription factors binding to their target sites. Based on TRANSCompel® 5.2, 187 CEs have been classified (Table 2). The majority of CEs contain at least one binding site for an inducible factor (140 CEs), and a number of CEs contain at least one binding site for a tissue-enriched factor (77 CEs). CEs can be classified into five main groups (Table 2):
1) 76 CEs formed by binding sites for two inducible factors, providing cross-coupling of signal transduction pathways;
2) 37 CEs formed by binding sites for a tissue-enriched and an inducible factor, providing tissue-specific responses to inducing signals;
3) 25 CEs formed by binding sites for a tissue-enriched and a constitutive ubiquitous factor, providing additional features of tissue-specific transcriptional regulation;
4) 22 CEs formed by binding sites for an inducible and a constitutive ubiquitous factor, providing additional features of inducible regulation;
5) 15 CEs formed by binding sites for two tissue-enriched factors, providing particular tissue-specific regulation.


Table 2: Functional classification of CEs (rows: class of factor F1; columns: class of factor F2; "-" marks a combination with no classified CEs).

F1 \ F2                          Tissue-      Inducible   Cell cycle-   Developmental     Ubiquitous
                                 restricted               dependent     stage-dependent   constitutive
Tissue-restricted                15
Inducible                        37           76
Cell cycle-dependent             -            1           2
Developmental stage-dependent    -            4           -             1
Ubiquitous constitutive          25           22          1             1                 2
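
The counts in Table 2 can be checked against the totals quoted in the text (187 classified CEs, 140 with an inducible factor, 77 with a tissue-enriched factor). The following is a minimal sketch of that arithmetic; the dictionary layout and function names are illustrative only, and the placement of the counts in the sparsely filled rows (cell cycle-dependent and developmental stage-dependent) is an assumption read from the table layout, chosen because it reproduces the quoted totals.

```python
# Class labels used in Table 2 (the text calls "tissue-restricted" factors "tissue-enriched").
TR = "tissue-restricted"
IND = "inducible"
CC = "cell cycle-dependent"
DS = "developmental stage-dependent"
UC = "ubiquitous constitutive"

# Number of classified CEs for each (F1 class, F2 class) combination.
# Cell positions in the CC and DS rows are an assumption (see the lead-in).
PAIR_COUNTS = {
    (TR, TR): 15,
    (IND, TR): 37, (IND, IND): 76,
    (CC, IND): 1, (CC, CC): 2,
    (DS, IND): 4, (DS, DS): 1,
    (UC, TR): 25, (UC, IND): 22, (UC, CC): 1, (UC, DS): 1, (UC, UC): 2,
}


def total_classified(counts):
    """Total number of classified composite elements."""
    return sum(counts.values())


def with_class(counts, factor_class):
    """Number of CEs in which at least one of the two factors has the given class."""
    return sum(n for pair, n in counts.items() if factor_class in pair)


if __name__ == "__main__":
    print("classified CEs:", total_classified(PAIR_COUNTS))
    print("with an inducible factor:", with_class(PAIR_COUNTS, IND))
    print("with a tissue-restricted factor:", with_class(PAIR_COUNTS, TR))
```

Under this reading, the five main groups account for 175 of the classified CEs, and the remaining 12 involve cell cycle-dependent or developmental stage-dependent factors or two ubiquitous constitutive factors.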

Connected program.   The program CATCH™ has been developed for searching for potential composite elements in DNA sequences. It is publicly available at http://compel.bionet.nsc.ru/FunSite/CompelPatternSearch.html. The program scans a sequence under study using each composite element collected in TRANSCompel as an individual search pattern. Several parameters restrict the search: the maximal number of mismatches in the cores of site1 and site2 comprising the composite element, the maximal variation of the distance between the two sites, and the composite score cut-off value [Kel-Margoulis O.V. et al., 1998]. The composite score reflects how well a match agrees with the known examples of the composite element in TRANSCompel; the scoring function takes into account the number of mismatches in both sites and the distance between them. All matches found are directly linked to the TRANSCompel entries containing the corresponding composite elements.
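
To make the search parameters concrete, the following is a minimal sketch of a pattern search of this kind. It is not the CATCH implementation: the class and function names, the example pattern, and in particular the scoring formula are assumptions made for illustration. The text states only that the score depends on the mismatches in both sites and on the distance between them; the sketch also scans a single strand and a single site orientation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompositePattern:
    """Hypothetical search pattern: two site cores and the spacer length of a known example."""
    name: str
    core1: str
    core2: str
    distance: int  # bases between the end of core1 and the start of core2


def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_core(seq, core, max_mm):
    """Return (position, mismatch count) for every window within max_mm mismatches of core."""
    hits = []
    for i in range(len(seq) - len(core) + 1):
        mm = mismatches(seq[i:i + len(core)], core)
        if mm <= max_mm:
            hits.append((i, mm))
    return hits


def composite_score(mm1, mm2, dist_dev, pattern, max_dist_var):
    """Assumed score in [0, 1]: fraction of matching core bases, scaled down by distance deviation."""
    core_len = len(pattern.core1) + len(pattern.core2)
    match_part = 1 - (mm1 + mm2) / core_len
    dist_part = 1 - dist_dev / (max_dist_var + 1)
    return match_part * dist_part


def search(seq, pattern, max_mm1=1, max_mm2=1, max_dist_var=2, cutoff=0.5):
    """Return (pos1, pos2, mm1, mm2, spacer, score) for matches passing all restrictions."""
    seq = seq.upper()
    hits2 = find_core(seq, pattern.core2, max_mm2)
    results = []
    for p1, mm1 in find_core(seq, pattern.core1, max_mm1):
        end1 = p1 + len(pattern.core1)
        for p2, mm2 in hits2:
            spacer = p2 - end1
            if spacer < 0:
                continue  # site2 must lie downstream of site1 in this sketch
            dist_dev = abs(spacer - pattern.distance)
            if dist_dev > max_dist_var:
                continue
            score = composite_score(mm1, mm2, dist_dev, pattern, max_dist_var)
            if score >= cutoff:
                results.append((p1, p2, mm1, mm2, spacer, round(score, 3)))
    return results


if __name__ == "__main__":
    # Illustrative pattern only; not taken from a TRANSCompel entry.
    example = CompositePattern("example-CE", core1="TGAGTCA", core2="GGAA", distance=3)
    for match in search("CCTGAGTCAACTGGAAGT", example, max_mm1=0, max_mm2=0):
        print(match)
```

Tightening `max_mm1`, `max_mm2` or `max_dist_var`, or raising `cutoff`, reduces the number of reported matches, which mirrors the role of the restricting parameters described above.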



ACKNOWLEDGEMENTS

Different parts of this work were funded by the German Bundesministerium fuer Bildung, Wissenschaft, Forschung und Technologie (FANGREB project and project no. X224.6), by the Russian Ministry of Sciences and the Siberian Branch of Russian Academy of Sciences, by the North Atlantic Treaty Organisation (grant no. 951149), by Volkswagen-Stiftung (I/75941) as well as by BIOBASE GmbH (Wolfenbuettel, Germany).


REFERENCES

  1. Kel,O.V., Romaschenko,A.G., Kel,A.E., Wingender,E., and Kolchanov,N.A. (1995) Nucleic Acids Res., 23, 4097-4103.
  2. Kel,O.V., Romaschenko,A.G., Kel,A.E., Wingender,E., and Kolchanov,N.A. (1997) Mol. Biol. (Mosk)., 31, 498-512.
  3. Kel-Margoulis,O.V., Kel,A.E., Frisch,M., Romaschenko,A.G., Kolchanov,N.A., and Wingender,E. (1998) Proceedings of the First International Conference on Bioinformatics of Genome Regulation and Structure, (BGRS'98), ICG, Novosibirsk, Vol.1, 54-57.
  4. Kel-Margoulis,O.V., Romaschenko,A.G., Kolchanov,N.A., Wingender,E., and Kel,A.E. (2000) Nucleic Acids Res., 28, 311-315.
  5. Kel-Margoulis,O.V., Romaschenko,A.G., Deineko,I.V., Kolchanov,N.A., Wingender,E., and Kel,A.E. (2000) Proceedings of the Second International Conference on Bioinformatics of Genome Regulation and Structure, (BGRS'2000), ICG, Novosibirsk, Vol.1, 45-48.