Biobase Biological Databases GmbH,
Halchtersche Straße 33,
D-38304 Wolfenbüttel
e-mail: elf@biobase.de, sla@biobase.de, sro@biobase.de, dka@biobase.de, ewi@biobase.de
Several genomes, including the human genome, have now been completely sequenced, and the start of the "post-genomic era", or the period of "functional genomics", has been proclaimed. Systematic elucidation of gene function requires linking sequence data with information about molecular mechanisms, and also with histological, anatomical and even taxonomical data. As a consequence, even "classical" branches of biological and medical research attract the interest of molecular biologists when they are linked to genome-based information.
Although the definition of ontologies is still under discussion, we call CYTOMER an ontology because it is a concise description of the principal, relevant entities and their potential relations to each other. CYTOMER is maintained as a relational database that aims to provide a comprehensive overview of all gene expression sources, focusing thus far on human entities. Gene expression sources are the organs, tissues and cell types in the different developmental stages of an organism. CYTOMER is therefore a database of physiological systems (table system), developmental stages (tables stage and period), anatomical structures and substructures (table organ) and their constituent cell types (table cell) in different organisms or species (table species). The central table of CYTOMER is HUB, a list that links entries of the five other tables. The HUB table represents anatomical and histological knowledge about which cells occur, with what kind of function, in which organs, at what stages and in which species (Fig. 1).
Figure 1: Simplified structure of the relational database schema of Cytomer.
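The hub-and-spoke layout described above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. Only the table names are taken from the text; all column names, the demo rows and the helper functions are assumptions made for illustration and do not reproduce the actual Cytomer schema. The period table is left out for brevity.

```python
import sqlite3

# Hypothetical, simplified schema: only the table names follow the text,
# every column name is an assumption made for this sketch.
SCHEMA = """
CREATE TABLE system  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE stage   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE organ   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE cell    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE hub (
    id         INTEGER PRIMARY KEY,
    system_id  INTEGER REFERENCES system(id),
    stage_id   INTEGER REFERENCES stage(id),
    organ_id   INTEGER REFERENCES organ(id),
    cell_id    INTEGER REFERENCES cell(id),
    species_id INTEGER REFERENCES species(id)
);
"""


def build_demo():
    """Create an in-memory database holding one illustrative hub entry."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Demo rows chosen for illustration; they are not actual Cytomer entries.
    conn.execute("INSERT INTO system  VALUES (1, 'respiratory system')")
    conn.execute("INSERT INTO stage   VALUES (1, 'adult')")
    conn.execute("INSERT INTO organ   VALUES (1, 'lung')")
    conn.execute("INSERT INTO cell    VALUES (1, 'alveolar macrophage')")
    conn.execute("INSERT INTO species VALUES (1, 'human')")
    # A hub row ties one entry of each surrounding table together.
    conn.execute("INSERT INTO hub VALUES (1, 1, 1, 1, 1, 1)")
    return conn


def expression_sources(conn):
    """Resolve every hub row into readable names by joining all tables."""
    query = """
        SELECT species.name, stage.name, system.name, organ.name, cell.name
        FROM hub
        JOIN species ON species.id = hub.species_id
        JOIN stage   ON stage.id   = hub.stage_id
        JOIN system  ON system.id  = hub.system_id
        JOIN organ   ON organ.id   = hub.organ_id
        JOIN cell    ON cell.id    = hub.cell_id
        ORDER BY hub.id
    """
    return conn.execute(query).fetchall()
```

Calling expression_sources(build_demo()) resolves the single hub row into one tuple of species, stage, system, organ and cell names. Keeping the combinations in one linking table means that each anatomical or cellular term is stored once and can be reused in many expression sources.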
The CYTOMER database is used to map expression patterns of genes and gene products. So far it has mainly been used to represent the expression patterns of transcription factors as given in the TRANSFAC database. To this end, entries of the HUB table have been linked to human transcription factor entries in the TRANSFAC factor table through two columns: 1) the CP (cell-specific positive) column for those expression sources in which a given factor has been shown to be expressed, and 2) the CN (cell-specific negative) column for those expression sources for which no evidence of a given factor has been published.
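As a rough illustration of how the CP and CN links might be queried, the sketch below models a factor entry with two sets of HUB identifiers. The class, field and function names, the factor name and the identifiers are all hypothetical; the real TRANSFAC factor table is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class FactorEntry:
    """Hypothetical stand-in for one row of a factor table."""
    name: str
    cp: set = field(default_factory=set)  # hub ids listed in the CP column
    cn: set = field(default_factory=set)  # hub ids listed in the CN column


def expression_status(factor, hub_id):
    """Classify one expression source (hub id) for the given factor."""
    if hub_id in factor.cp:
        return "positive"
    if hub_id in factor.cn:
        return "negative"
    # Sources linked through neither column carry no annotation at all.
    return "not annotated"


# Made-up factor and hub ids, used purely for demonstration.
demo_factor = FactorEntry(name="FactorX", cp={101, 102}, cn={205})
```

With this demo entry, hub 101 is classified as positive, hub 205 as negative, and any other identifier as not annotated, which keeps missing annotation distinct from a recorded CN link.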
CYTOMER is first and foremost a module that complements TRANSFAC and enables proper representation of the expression patterns of transcription factors. However, CYTOMER is to be extended into an independent database system that serves customers with specific aims.
We supplemented CYTOMER, which was previously restricted to human and mouse, with data on the nematode Caenorhabditis elegans, a well-characterized species whose entire genome sequence is known.
The most extensive tables of CYTOMER are the organ and cell tables. The organ table is itself hierarchically organized and represents an ontology of anatomical structures and substructures. Its entries carry English terms, synonyms, the medical terminology and the anatomical parents. Definitions of anatomical structures are given in German and English. The cell table includes an international, an English and a German cell name as well as synonyms and the cell parents. In addition, there are short descriptions of cell location and function. For instance, the lung belongs to the respiratory system together with the nose, larynx, trachea and bronchial tree. The respiratory system in turn is an entry of the physiological system table. Fig. 2 shows how some anatomical structures of the lung are hierarchically represented in the database.
Figure 2: Structural hierarchy of lung morphology and its representation in the Cytomer® ontology.
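The "anatomical parent" relation can be pictured as a simple child-to-parent mapping. The following sketch assumes one parent per term and uses a few lung-related terms chosen for illustration; they are not copied from the Cytomer organ table, and the function name is hypothetical.

```python
# Illustrative child -> parent mapping; None marks a top-level structure.
PARENT = {
    "lung": None,
    "right lung": "lung",
    "superior lobe of right lung": "right lung",
}


def path_to_root(term, parent=PARENT):
    """Return the chain from a term up to its top-level structure."""
    path = [term]
    while parent[term] is not None:
        term = parent[term]
        path.append(term)
    return path
```

For example, path_to_root("superior lobe of right lung") walks upward through "right lung" to "lung". Under this assumed layout, an expression source annotated at a fine-grained substructure can also be reported for each enclosing structure.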
The organ-table hierarchy of the human adult has now been completed. It comprises 8400 entries of adult anatomical structures and substructures, with hierarchy depths ranging from 1 to 11. There are 2079 inner nodes (including 80 beginning nodes, i.e. "primary organs") and 6321 end nodes. The organ table will be made freely available on the World Wide Web in the near future at www.biobase.de.
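The reported figures are internally consistent: 2079 inner nodes plus 6321 end nodes give the 8400 entries. How such statistics could be derived from a parent mapping is sketched below on a toy hierarchy. The sketch assumes that inner nodes are entries with at least one child, that beginning nodes are entries without a parent, and that depth is counted from 1 at a beginning node; the terms and the function name are illustrative only.

```python
def hierarchy_stats(parent):
    """Count node types and the maximum depth of a child -> parent mapping."""
    # Every term that appears as somebody's parent has at least one child.
    has_child = {p for p in parent.values() if p is not None}

    def depth(term):
        # Depth 1 is assumed for entries without a parent.
        d = 1
        while parent[term] is not None:
            term = parent[term]
            d += 1
        return d

    return {
        "total": len(parent),
        "inner": len(has_child),
        "beginning": sum(1 for p in parent.values() if p is None),
        "end": len(parent) - len(has_child),
        "max_depth": max(depth(t) for t in parent),
    }


# Toy hierarchy, not actual Cytomer content.
TOY_PARENT = {
    "lung": None,
    "right lung": "lung",
    "left lung": "lung",
    "superior lobe of right lung": "right lung",
}
```

On the toy hierarchy, hierarchy_stats(TOY_PARENT) counts four entries, of which two are inner nodes (one of them a beginning node) and two are end nodes, with a maximum depth of 3.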
The authors are indebted to Xin Chen, AG Bioinformatics/GBF and Peking University, for her help in the initial stages of defining the Cytomer database structure and first data contents.