In Silico Biology 2, 0009 (2002); ©2002, Bioinformation Systems e.V.  
Dagstuhl Seminar "Functional Genomics"

GeneNet database: Description and modeling of gene networks

Nikolay A. Kolchanov, Eugenia A. Nedosekina, Elena A. Ananko, Vitaly A. Likhoshvai, Nikolay L. Podkolodny, Alexander V. Ratushny, Irina L. Stepanenko, Olga A. Podkolodnaya, Elena V. Ignatieva and Yury G. Matushkin




Institute of Cytology and Genetics SB RAS,
Novosibirsk, Russia





Edited by E. Wingender; received November 15, 2001; revised and accepted December 11, 2001; published February 27, 2002


Abstract

Almost all cellular processes in an organism are controlled by gene networks. Here we report on the analysis of gene network functioning by two complementary methods: accumulation of data in the GeneNet system and the generalized chemical kinetic method for mathematical simulation of gene network dynamics. The use of these methods is illustrated with the gene network of macrophage activation.

Keywords: gene network, modeling, mathematical simulation, database designing, macrophage.



Introduction

It is well known that all cellular processes, including cell metabolism, cell division and differentiation, as well as the functioning of different organs, are genetically determined and controlled by gene networks.

The history of theoretical studies of gene networks is considered to start in the 1960s. In his pioneering work, Ratner described the general features of the organization of molecular genetic systems that control the functioning of prokaryotes [Ratner, 1966]. Another study was devoted to the dynamics of gene network functioning by means of the simplest logical schemes [Kauffman, 1969]. By now, many approaches to the study of gene networks have been developed, including stochastic models [McAdams and Arkin, 1997], Petri nets [Hofestädt and Meineke, 1995], Boolean networks [Sanchez et al., 1997], the logical approach [Thieffry and Thomas, 1995; Sanchez et al., 1997], threshold models [Tchuraev, 1991], and approaches based on differential equations [Thomas, 1973; Savageau, 1985; Likhoshvai et al., 2000].

Theoretical studies of gene networks accelerated in the 1990s, when a large amount of experimental data on the functioning of biological objects, necessary for such studies, had been accumulated. This made it necessary to develop databases, both general ones and databases specialized for different organisms, processes, and types of information. In the present article, we describe a module used for studying gene networks, namely the GeneNet database, which is part of the GeneNetWorks system. The GeneNet database is available via the Internet (http://wwwmgs.bionet.nsc.ru/systems/mgl/genenet/) and is used for accumulating biological information on the organization and functioning of gene networks. We also demonstrate a method for mathematical simulation of biological systems and processes, using a generalized chemical kinetic approach to model macrophage activation as an example.



Methods

1. Technology for describing gene networks

a) Object-oriented approach

For the description of gene networks we apply an object-oriented approach in which we divide all components into two types: elementary structures and elementary events.

By "elementary structure" we denote genes, RNAs, proteins, and non-proteinaceous substances. If necessary, some other components may be added to those listed above. Each class of objects is described by means of a specialized format, so for the description of elementary structures the database contains the following tables: GENE, RNA, PROTEIN and SUBSTANCE.

Elementary events are divided into two types: reactions and regulatory events. By "reactions", we mean interactions between substances that result in the synthesis of new substances. A "regulatory event" is any event that influences a reaction. In the GeneNet database, reactions are classified into direct and indirect reactions, whereas regulatory events are represented by "switching on", "switching off", "positive effect", and "negative effect".


Elementary structures.   As indicated above, the elementary structures of gene networks in the GeneNet database are described in the tables GENE, PROTEIN, RNA and SUBSTANCE. For description of elementary structures, we use information about the brief and complete name of a substance, about the organism and cell, where this substance is present, as well as information about the literature source, etc.

An example of an entry is shown in Table 1. It indicates that the murine protein IP-10 is an IFN-γ-inducible protein (NM); it has an active state (FN); it is extracted from macrophages (SO); etc.

Table 1: Formalized description of the mouse protein IP-10 in the table PROTEIN of the GeneNet database.

Information field | Field content | Type of information
ID | Mm:IP-10 | Identifier in the database
DT | 27.04.2001; Nedosekina E.A.; created. 11.10.2001; Nedosekina E.A.; updated. | Information about creation and editing of the entry
OS | Mus musculus (mouse). | Species name
SN | IP-10 | Brief protein name
NM | IFN-gamma-inducible protein | Complete protein name
FN | Active | Functional state
MM | no data | Information about multimer organization of a protein
MD | no data | Information about post-translational modification of a protein (e.g., phosphorylation)
GN | Mm:IP-10 | Identifier of corresponding gene
CC | 10 kDa | Textual comments
SO | Mm:macrophage | Name of a cell line used in an experiment
RF | Kopydlowski K.M. et al., 1999 | Reference to the literature source
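
The entry in Table 1 maps naturally onto a record type. The following is a minimal sketch, assuming each field is available as a line of the form "CODE value"; the class `ProteinEntry` and the function `parse_entry` are hypothetical names introduced for illustration and are not part of the GeneNet software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProteinEntry:
    """Hypothetical in-memory mirror of a PROTEIN entry (codes as in Table 1)."""
    ID: str
    OS: str
    SN: str
    NM: str
    FN: str
    GN: str
    SO: str
    RF: str
    MM: Optional[str] = None  # multimer organization, None when "no data"
    MD: Optional[str] = None  # post-translational modification, None when "no data"
    CC: str = ""


def parse_entry(lines):
    """Build a ProteinEntry from 'CODE value' lines (assumed flat layout)."""
    fields = {}
    for line in lines:
        code, _, value = line.partition(" ")
        value = value.strip()
        fields[code] = None if value == "no data" else value
    fields.pop("DT", None)  # creation/editing history is not kept in this sketch
    return ProteinEntry(**fields)


ip10 = parse_entry([
    "ID Mm:IP-10",
    "OS Mus musculus (mouse).",
    "SN IP-10",
    "NM IFN-gamma-inducible protein",
    "FN Active",
    "MM no data",
    "MD no data",
    "GN Mm:IP-10",
    "CC 10 kDa",
    "SO Mm:macrophage",
    "RF Kopydlowski K.M. et al., 1999",
])
```

Keeping "no data" as `None` rather than as a string makes it easy to distinguish missing annotation from real values when entries are later used to build a model.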


Elementary events.   For description of elementary events, information about the type of reaction (direct or indirect) and regulatory event (switch on, switch off, enhancement of reaction or its attenuation, etc.), literature source, etc. is used.

An example of the description of an elementary event within the gene network of macrophage activation is shown in Table 2. The PTS enzyme (6-pyruvoyl tetrahydropterin synthase) catalyzes the transformation of DHNP-3p (7,8-dihydroneopterin-3'-triphosphate) into 6-PTHP (6-pyruvoyl tetrahydropterin). The regulatory event acts on a direct reaction: the enzyme switches on the transformation.



Table 2: Formalized description of enzyme reaction catalyzed by PTS enzyme in the table RELATION of the GeneNet database.

Information field | Field content | Type of information
ID | <protein>Hs:PTS^cytoplasm ->> <substance>DHNP-3p^cytoplasm -> <substance>6-PTHP^cytoplasm | Identifier of relation in the database
DT | 27.04.2001; Nedosekina E.A.; created. 11.10.2001; Nedosekina E.A.; updated. | Information about creation and editing of the entry
AT | switch on | Type of regulatory action
EF | direct | Reaction type
RF | Kopydlowski K.M. et al., 1999 | Reference to the literature source
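
The relation identifier in Table 2 encodes the regulator, the source, and the product of the reaction in one string. The sketch below splits such an identifier into its parts. The grammar is inferred from this single example and is an assumption: "->>" is taken to separate the regulator from the reaction, "->" the source from the product, and each component is read as `<class>name^compartment`. The names `Component`, `parse_component`, and `parse_relation` are hypothetical.

```python
import re
from typing import NamedTuple


class Component(NamedTuple):
    kind: str         # e.g. "protein" or "substance"
    name: str         # e.g. "Hs:PTS"
    compartment: str  # e.g. "cytoplasm"


# Assumed component syntax: <class>name^compartment
_COMPONENT = re.compile(r"<(\w+)>([^^]+)\^(\S+)")


def parse_component(text):
    match = _COMPONENT.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized component: {text!r}")
    return Component(*match.groups())


def parse_relation(identifier):
    """Return (regulator, source, product) for 'R ->> S -> P' identifiers."""
    regulator, _, reaction = identifier.partition("->>")
    source, _, product = reaction.partition("->")
    return tuple(parse_component(part) for part in (regulator, source, product))


regulator, source, product = parse_relation(
    "<protein>Hs:PTS^cytoplasm ->> "
    "<substance>DHNP-3p^cytoplasm -> "
    "<substance>6-PTHP^cytoplasm"
)
```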


b)  Levels of gene network representation

For an adequate description of gene networks, it is necessary to represent them on different levels of organization. In the GeneNet system, gene networks may be described on the levels of an organism, a cell, and a gene [Kolchanov et al., 2002].

Organism level. Among the entities described on this level are the organs, tissues, particular types of cells, and secreted proteins and substances affecting other organs, tissues, and cells. On this level, the spatial distribution of gene network components in the organism may be described.

Cell level. The entities described on this level include different cell compartments (for example, cytoplasm, nucleus, and mitochondria), proteins, RNAs, genes, and substances. On this level, the distribution of the gene network components among cell compartments is described.

Gene level. On this level, regulation of gene expression is described in detail, employing the information from the TRRD database [Kolchanov et al., 2000].


c)  Data input and visualization of gene network structure

For browsing the gene networks, a special program GeneNet Viewer was designed [Kolpakov et al., 1998]. This program provides visualization of information. All components of these gene networks may be accessed interactively. Information about a particular component may be extracted by clicking on this component with the mouse.

The notation used for the gene network components is shown in Figure 1.

Figure 1: Denotations of some components of a gene network in the GeneNet system.


d)  Information content of the GeneNet system

Gene networks represented in the GeneNet system may be classified into the following groups in accordance with the function they perform in an organism: lipid metabolism, endocrine regulation, erythrocyte maturation, immune system, heat shock response, redox-regulation and plant gene networks [Podkolodnaya et al., 2000; Ananko et al., 2000; Logvinenko et al., 2000; Stepanenko et al., 2000a; Stepanenko et al., 2000b; Goryachkovsky et al., 2000a; Goryachkovsky et al., 2000b; Axenovich et al., 2000]. The names of these gene networks and the numbers of components they contain are listed in Table 3. In the current version of the GeneNet system, there are 25 gene networks, which may be arbitrarily subdivided into the following groups [Kolchanov et al., 2002]:

  1. Gene networks regulating cell growth and differentiation, morphogenesis of tissues and organs, growth and development of the organism;

  2. Gene networks maintaining homeostasis of biochemical and physiological parameters of the organism; and

  3. Gene networks providing response of the organism to changes of the environment, for example, stress response.

Additionally, a group of gene networks that are currently under construction should be mentioned here:

  4. Gene networks controlling cyclic processes, for example, the cell cycle, the cycle of heart muscle contraction, etc.

In the following, we consider in detail the organization and functioning of the gene network of macrophage activation as an example.

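Table 3: Informational content of the GeneNet database (as of August 1, 2001). The last three columns give the number of components.

GeneNet section | Entry name in GN_SCHEME | Genes | Proteins | Relationships
Lipid metabolism | Cholesterol | 6 | 11 | 36
Lipid metabolism | Cholesterol_MODEL | 5 | 18 | 43
Lipid metabolism | Leptin (organism level) | 43 | 19 | 89
Endocrine regulation | Principal cell of CCD | 3 | 15 | 34
Endocrine regulation | Steroidogenesis (adrenal cortex) | 15 | 39 | 80
Endocrine regulation | Steroidogenesis (sex steroids) | 12 | 41 | 78
Endocrine regulation | Thyroid system | 23 | 66 | 110
Erythrocyte maturation | Erythroid differentiation | 41 | 51 | 98
Immune system | Antiviral response | 12 | 51 | 53
Immune system | Macrophage activation (model) | 37 | 70 | 124
Heat shock response | HSP70 autoregulation | 6 | 19 | 37
Heat shock response | Heat shock response | 36 | 41 | 112
Heat shock response | Thermotolerance | 4 | 40 | 64
Redox-regulation | REDOX-regulation | 48 | 43 | 111
Degradation of storages under seed germination | Germination (endosperm) | 5 | 21 | 25
Degradation of storages under seed germination | LEA program | 13 | 32 | 27
Degradation of storages under seed germination | Seed reserve mobilization (1): carbohydrates | 7 | 7 | 34
Degradation of storages under seed germination | Seed reserve mobilization (2): lipids and phosphates | 5 | 8 | 31
Degradation of storages under seed germination | Seed reserve mobilization (3): proteins | 5 | 11 | 42
Degradation of storages under seed germination | Seed reserve mobilization (4): regulatory relationships | 12 | 22 | 62
Degradation of storages under seed germination | Seed reserve mobilization (5): general diagram | 11 | 27 | 59
Degradation of storages under seed germination | Seed reserve mobilization (organism level) | 7 | 23 | 46
Biosynthesis of storages in the process of seed maturation | Storage protein biosynthesis (dicot and monocot) | 22 | 56 | 64
Plant cell response on infection by pathogens | Plant-pathogen | 31 | 34 | 65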




2. Gene network dynamics simulation

a) Mathematical simulation of gene network functional dynamics

Gene networks are complex objects for modeling since they contain a large number of components and interactions. Therefore, special methods are required to handle such large systems. It is also characteristic of gene networks that alternative hypotheses exist for most of them, a fact that should be taken into account as well. All these requirements are satisfied by the generalized chemical kinetic modeling method developed earlier [Bazhan et al., 1995; Belova et al., 1995; Likhoshvai et al., 2000; Ratushny et al., 2000].


b) Generalized chemical kinetic method: general description

The generalized chemical kinetic simulation method (GCKSM) is constructed according to the principle of blocks. This principle implies that the object under study is successively divided into simpler subsystems. As a result, the final set of elementary processes and structural elements is determined. Within the framework of this approach, we regard a process as elementary if, for its description, it is sufficient to use only the parameters and concentrations of the components of the process considered. Thus, regulatory interactions and reactions described in gene networks may be considered as elementary processes. The structural elements are genes, mRNA, protein molecules, and low-molecular substances. In the GCKSM approach, a model of a gene network is considered as the totality of its elementary processes.

Each elementary event is described separately by means of formal blocks. A formal block is characterized by a set of dynamic variables, formal parameters, and the law of rearrangement of information. The description of an elementary process consists of assigning a particular biological meaning to the variables and the parameters of the formal block that describes the process [Kolchanov et al., 2002].

In the general case, the generalized chemical kinetic approach places no special limitations on the mode of description of elementary blocks. This mode is chosen in accordance with the tasks and goals of the study. The laws of processing the information may be expressed by systems of differential equations, or given in discrete, probabilistic, stochastic, or combined terms [Likhoshvai et al., 2001]. The mathematical apparatus used for modeling the dynamics of the gene network of macrophage activation consists of formal blocks based on the following differential equations:


1) Reversible bimolecular reaction:
    


2) Irreversible monomolecular reaction:
    


3) Constitutive synthesis:
    


4) Enzyme reaction:
,where .
    e=0,1.
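
For orientation, blocks of these four types are commonly written in textbook chemical kinetics as shown below. This is only a sketch under assumptions: the symbols X, Y, Z, S, P, E, the constants k_1, k_2, k, k_s, k_0, K_m, and the reaction schemes in the comments are standard mass-action and Michaelis-Menten forms chosen for illustration, not necessarily the authors' notation, and the role of the parameter e is not covered.

```latex
\begin{align*}
% 1) Reversible bimolecular reaction, X + Y <-> Z
\frac{dX}{dt} = \frac{dY}{dt} = -\frac{dZ}{dt} &= -k_1 X Y + k_2 Z \\
% 2) Irreversible monomolecular reaction, X -> Y
\frac{dX}{dt} = -\frac{dY}{dt} &= -k X \\
% 3) Constitutive synthesis of X at a constant rate
\frac{dX}{dt} &= k_s \\
% 4) Enzyme reaction, enzyme E converting substrate S into product P
\frac{dS}{dt} = -\frac{dP}{dt} &= -\frac{k_0 E S}{K_m + S}
\end{align*}
```

Form 2 has the same shape as the secretion example given in the section "Mathematical model of macrophage activation", and form 4 has the same rate expression as the p38 phosphorylation example there, except that in that example the active enzyme is also consumed.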


In this case, all structural elements of a gene network (genes, RNA, proteins, inorganic substances, and complexes) enter the mathematical model as unique variables, which are unambiguously identified by their assigned names. Descriptions of all known intracellular processes in which these components take part (i.e., synthesis, execution of some function, secretion from the cell, degradation, etc.; see the particular example in the section "Mathematical model of macrophage activation") enable us to construct unambiguously the system of differential equations for these events by applying the rule of summation of elementary reaction rates in elementary blocks (Fig. 2). This system describes how the concentrations of the gene network's components change with time. Setting the initial concentrations of the gene network elements makes it possible to calculate the state of the network at any subsequent time.


Figure 2: The law of summing up the rates of the processes.
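
The rule of summation can be sketched in a few lines: each elementary block reports its contribution to the rate of change of every variable it touches, and the right-hand side of the full system is the sum of these contributions. The block constructors, variable names, and numerical values below are hypothetical and serve only to illustrate the principle; they are not taken from the macrophage model.

```python
from collections import defaultdict


def constitutive_synthesis(product, ks):
    """Block: the product is formed at a constant rate ks."""
    def rates(conc):
        return {product: ks}
    return rates


def monomolecular(source, product, k):
    """Block: irreversible conversion of source into product with constant k."""
    def rates(conc):
        v = k * conc[source]
        return {source: -v, product: +v}
    return rates


def total_rates(blocks, conc):
    """Sum the contributions of all blocks for every variable."""
    total = defaultdict(float)
    for block in blocks:
        for name, rate in block(conc).items():
            total[name] += rate
    return dict(total)


# Two blocks sharing the variable X: synthesis of X and conversion X -> Y.
blocks = [
    constitutive_synthesis("X", ks=2.0),
    monomolecular("X", "Y", k=0.5),
]
derivs = total_rates(blocks, {"X": 4.0, "Y": 0.0})
# dX/dt = 2.0 - 0.5 * 4.0 and dY/dt = 0.5 * 4.0
```

Because blocks only communicate through shared variable names, alternative hypotheses about a process can be compared by swapping individual blocks without rewriting the rest of the system.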




Results and discussion

a) Gene network of macrophage activation

Macrophages are among the key components of the immune system. To support normal functioning of an organism and its defense against a wide range of harmful agents, these cells perform functions such as phagocytosis, support and regulation of the immune response, regeneration of disrupted tissues, etc.

LPS (lipopolysaccharide) and IFN-γ (interferon-gamma) are most frequently used as activating agents when investigating the processes caused by macrophage activation. These agents were chosen mostly because they frequently activate macrophages under natural conditions. LPS is the main component of the outer membrane of Gram-negative bacteria. IFN-γ is a cytokine secreted by T-lymphocytes in response to infection. Complete activation of macrophages, which supports the full immune response, takes place under simultaneous activation by both agents mentioned above.

In addition to morphological alterations, activated macrophages are characterized by enhanced phagocytosis followed by synthesis of a particular set of cytokines: IL-1, IL-6, IL-10, IL-12, IFN-α, IFN-β, and TNF-α. These cytokines influence different cells, including cells of the immune system, fibroblasts, platelets, etc. Other substances whose synthesis is enhanced during macrophage activation include reactive radicals and oxidants (NO•, H2O2, O2•−); enzymes (cyclooxygenase-2, inducible nitric oxide synthase, carboxylesterase, lysozyme, acid phosphatase and others); proteins expressed on the cell surface (histocompatibility complex antigens, immunoglobulin receptors, cell adhesion molecules, etc.); and some other substances.

Thus, during macrophage activation, transcription of many genes is activated. Among them are the genes of iNOS, IRF-1, IRF-2, ICAM1, IP-10, ICSBP, Fc-RI, mig, GBP-1, CIITA, IFN-, IL-6, IL-1, IL-12p40, MIP-1, COX-2, TNF-α, GM-CSF, etc.

Let us take a closer look at the synthesis of one of the components of the gene network of macrophage activation, nitric oxide (NO).

This synthesis is catalyzed by the enzyme inducible nitric oxide synthase (iNOS). The iNOS gene is normally expressed in macrophages at a low level, but its expression increases considerably under the action of different agents, including interferons, TNF-α, and IL-1. In macrophages, iNOS catalyzes the synthesis of NO, which has cytotoxic and protective effects. Like other NO synthases, iNOS requires L-arginine, molecular oxygen, and NADPH as substrates, as well as the co-factors tetrahydrobiopterin, FAD (flavin adenine dinucleotide), and FMN (flavin mononucleotide) [Marletta et al., 1993].

The co-factor tetrahydrobiopterin (H4B) is synthesized from GTP in three stages (Fig. 3). The enzyme GCH1 (GTP cyclohydrolase I), which catalyzes the first stage (conversion of GTP into 7,8-dihydroneopterin-3'-triphosphate, DHNP-3p), must first be activated. Its activation was found to take place after the action of IFN-γ on the cell [Werner et al., 1990]. The next reaction (conversion of DHNP-3p into 6-pyruvoyl tetrahydropterin, 6-PTHP) is catalyzed by the enzyme PTS (6-pyruvoyl tetrahydropterin synthase). The last stage is catalyzed by SPR (sepiapterin reductase), which completes the synthesis of tetrahydrobiopterin [Werner et al., 1990].

Figure 3: A fragment of the gene network of macrophage activation which represents the NO synthesis in a macrophage.

As a result of the action of the inducers LPS and IFN-γ, the synthesis of a large number of gene products is induced in a macrophage, so that the cell can perform its functions completely.


b) Mathematical model of macrophage activation

On the basis of the scheme of the gene network, and using the generalized chemical kinetic modeling approach, we developed a mathematical model of macrophage activation. This model includes 245 reactions and 167 dynamic variables. The scheme of the gene network underlying the mathematical model is available via the Internet at http://wwwmgs.bionet.nsc.ru/systems/MGL/GeneNet/. As mentioned previously (see section "Generalized chemical kinetic method: general description"), each elementary process is described by a formal block. For example, phosphorylation of p38 by MAP kinase kinase (Fig. 4A) is described in the mathematical model by the following system of equations:

   dA/dt=dB/dt=-dC/dt=-dD/dt=-k0*A*B/(Km+B),

where A corresponds to activated MAP kinase kinase; B - p38, C - phosphorylated form of p38, D - inactive form of MAP kinase kinase, k0 - turn-over constant, Km - Michaelis constant.
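
The block above can be evaluated numerically as follows. This is a minimal sketch that implements the equation exactly as printed (both A and B are consumed, C and D are formed) and integrates it with a simple explicit Euler scheme; the parameter values, initial concentrations, step size, and function names are hypothetical and are not taken from the model.

```python
def phosphorylation_rates(A, B, k0, Km):
    """Rates of change for the p38 phosphorylation block as printed."""
    v = k0 * A * B / (Km + B)
    return {"A": -v, "B": -v, "C": +v, "D": +v}


def euler(state, k0, Km, dt, steps):
    """Integrate the block with explicit Euler steps (illustration only)."""
    state = dict(state)
    for _ in range(steps):
        rates = phosphorylation_rates(state["A"], state["B"], k0, Km)
        for name in state:
            state[name] += dt * rates[name]
    return state


# Hypothetical values: concentrations per cell in arbitrary units.
start = {"A": 3.0, "B": 1.0, "C": 0.0, "D": 0.0}
end = euler(start, k0=2.0, Km=1.0, dt=0.001, steps=1000)
```

Two conservation relations follow directly from the equation and make a convenient sanity check: A + D (total MAP kinase kinase) and B + C (total p38) stay constant during the integration.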

Another elementary process, secretion of the protein IP-10 out of the cell, is given as follows (Fig. 4B):

   dE/dt=-dF/dt=-k*E,

where E is IP-10 protein in a cell, F is IP-10 protein in a medium, k is reaction constant.
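
Taken in isolation, this block can be solved in closed form, which is useful for checking a numerical implementation. Writing E_0 and F_0 for the initial amounts (symbols introduced here for illustration), and assuming no other process acts on E or F:

```latex
\begin{align*}
E(t) &= E_0 \, e^{-k t}, \\
F(t) &= F_0 + E_0 \left(1 - e^{-k t}\right), \\
t_{1/2} &= \frac{\ln 2}{k}
\end{align*}
```

Here t_{1/2} is the time after which half of the intracellular IP-10 has been secreted. In the full model E is also fed by synthesis, so this solution describes only the isolated block.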

Figure 4: Fragments of the gene network on macrophage activation. A is phosphorylation of p38 by MAP kinase kinase, B is secretion of the IP-10 protein.

The object of modeling is a cell, so all concentrations of the gene network components were expressed per single cell.

After developing the mathematical model of the gene network of macrophage activation, we performed calculations to estimate quantitative characteristics of macrophage functioning in the stationary state, prior to activation. In these calculations, the concentrations and constants of the network components were chosen so that the results fitted the experimental data on the respective concentrations as closely as possible.

Then, starting from the stationary points obtained, we selected the model parameters that gave adequate dynamics of some components when macrophages are activated by LPS and IFN-γ. As a criterion of adequacy, we used data obtained in experimental studies.

For example, we have made a comparison of the data obtained in an experimental study [Chen et al., 1995] with those estimated by the mathematical model. These data have revealed the dependency of the concentration of nitric oxide (NO) synthesized in a cell upon LPS concentration (Fig. 5). As mentioned above, NO is one of the basic substances synthesized by activated macrophages.

In the biochemical experiments, LPS was added at different concentrations and the NO yield was measured after 6 hours. As can be seen in the plot (Fig. 5), the optimal concentration of LPS in the experiment was 10 µg/ml. A further increase in LPS concentration has practically no influence on the output of NO.

Similar results were obtained with the mathematical model. When the LPS concentration is lower than 1.25 µg/ml, we observed a discrepancy between the calculated and experimental data. This could be explained by the difference between the NO concentration in the control of the biochemical experiments and the NO concentration adopted for the stationary state of a cell in the mathematical model (the latter was taken as an average value measured in several experiments reported in [Chen et al., 1995]).

Figure 5: Comparison of data obtained by mathematical modeling and in the experiment [Chen et al., 1995]. The data show the amount of synthesized NO as a function of LPS concentration. NO concentration is measured as the number of molecules per cell (unit/cell). LPS concentration is measured in micrograms per milliliter (µg/ml). A, experimental data [Chen et al., 1995]; B, data obtained by mathematical modeling.


The rate of NO synthesis depends upon the concentration of the receptor that binds the activator (i.e., CD14, in the case of LPS). We modeled the influence of a mutation that increases the concentration of this receptor by an order of magnitude (for example, through an increase in the half-life of this protein or through transcriptional activation of the respective gene). This mutation leads to a considerable growth in the yield of nitric oxide (Fig. 6), which, in turn, gives rise to different pathologies.

Figure 6: Effect of a conditional mutation, which increases the concentration of the CD14 receptor, on the concentration of NO synthesized by a cell in response to LPS. NO concentration is measured as the number of molecules per cell (unit/cell). A, normal cell state according to [Chen et al., 1995]; B, normal cell state estimated by the mathematical model; C, influence of the mutation, evaluated by the mathematical model.

The effect of this mutation may be compensated by adding to the medium (or administering to an organism) a substance that binds the excess of receptor (Fig. 7). Some parameters of this hypothetical substance, such as its constant of binding to the receptor or its concentration, may be calculated with the mathematical model.

Figure 7: Dynamics of NO synthesis under induction by LPS: the compensatory effect of a substance that binds the excess of CD14 receptor when a mutation has increased the concentration of this receptor. NO concentration is measured as the number of molecules per cell (unit/cell). A, the normal cell state estimated from the data of [Chen et al., 1995]; B, the action of the mutation evaluated in accordance with the mathematical model.
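
As an illustration of how such parameters could be estimated, the sketch below treats the compensating substance as a simple 1:1 reversible binder of the receptor at equilibrium. This simplification, the dissociation constant `kd`, the function names, and all numerical values are assumptions made for the example; the full model described here may treat the binding differently.

```python
import math


def free_receptor(r_total, s_total, kd):
    """Free receptor at equilibrium for 1:1 binding R + S <-> RS."""
    b = r_total + s_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * r_total * s_total)) / 2.0
    return r_total - complex_conc


def binder_needed(r_total, r_target, kd):
    """Total binder that leaves r_target receptor free (0 < r_target <= r_total)."""
    bound = r_total - r_target           # receptor that must be held in the complex
    free_binder = kd * bound / r_target  # from kd = [R][S] / [RS]
    return bound + free_binder


# Hypothetical case: a tenfold excess of receptor (10 units instead of 1).
dose = binder_needed(r_total=10.0, r_target=1.0, kd=1.0)
restored = free_receptor(r_total=10.0, s_total=dose, kd=1.0)
```

The second function inverts the first: the amount of binder it returns brings the free receptor back to the target level, and a weaker binder (larger `kd`) requires a proportionally larger amount.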


The mathematical model that we have constructed reflects the process of macrophage activation under the action of LPS and IFN-γ. Using this model, it is possible to predict quantitative and qualitative characteristics of gene network functioning in the normal state, as well as to estimate the impact of mutations, or of the administration of drugs with a known pattern of action, on the other components and processes of the gene network.



Conclusions

Gene networks are complex molecular-genetic objects. The following considerations determine the choice of methods and technologies for constructing and analyzing gene networks:

  1. Gene networks contain many (from dozens to many thousands) molecular components and relations between them;
  2. Information about the functioning of biological objects is currently being actively accumulated, and novel links are steadily being discovered;
  3. Adequate mathematical simulation requires a single model that can accommodate different variants of a given process, representing alternative hypotheses about the functioning of the gene network.

Thus, the methods used for the analysis of gene networks should make it possible to describe a large amount of diverse information and to adapt the gene network to newly appearing data. For mathematical modeling, it is important to be able to store different variants of the gene networks under consideration. For this reason, we have used an object-oriented approach to describe gene networks in the GeneNet system. For mathematical simulation, we have applied the generalized chemical kinetic modeling method, which was developed to describe the kinetics of the processes considered and which is based on the analysis of elementary structures.

Using the mathematical models of gene networks that have been developed, it is possible to study the behavior of a system in its normal state, to study its various disruptions and pathologies, and to suggest possible solutions to these problems.

As a next step, we plan to develop further gene networks and new mathematical models. The GeneNet Viewer is currently under development, as is an interface for developing gene network models on-line. This will allow a wide range of specialists to use our method of mathematical modeling of gene networks.



References