Institut für Biotechnologie 2, Forschungszentrum Jülich,
52425 Jülich,
Email: j.hurlebaus@fz-juelich.de,
r.takors@fz-juelich.de
1Abt. Simulationstechnik und Informatik, Institut für Mechanik und
Regelungstechnik, FB 11,
Universität Siegen,
Paul-Bonatz-Str. 9-11,
57068 Siegen,
Email: wiechert@simtec.mb.uni-siegen.de
Identifying metabolic regulation is a key task in metabolic engineering. Metabolic regulation phenomena depend on intracellular compounds such as enzymes, metabolites, nucleotides and cofactors. A complete understanding of metabolic regulation requires quantitative information about these compounds under in vivo conditions [2, 11]. Combined with the known network of metabolic pathways, this quantitative knowledge allows the construction of mathematical models that describe the dynamic changes in metabolite concentrations over time. The models are high-dimensional systems of non-linear ordinary differential equations. The main problems of the approach are setting up the kinetic rate equations that describe the metabolic pathways and identifying the system parameters. To address these problems, a variety of pathway modeling software has been developed that simplifies model construction and analysis.
At the Institute of Biotechnology, a metabolic modeling tool (MMT) for pathway analysis that is based on a relational database has been developed [6]. The software can be used to define and analyze dynamic pathway models based on non-linear kinetic rate equations. MMT differs from other available pathway modeling tools by integrating efficient model storage management (relational database), a structured pathway overview, simulation and parameter identification algorithms, and graphical output of results in one software tool. The software was developed specifically to analyze data from fast sampling experiments conducted at the Institute of Biotechnology.
Figure 1 summarizes the structure of a network of metabolic pathways and the corresponding mathematical formulation. The metabolic network is represented as a bipartite graph in which the letters A to D represent compounds such as metabolites or nucleotides. The letters u, v, p and q represent the reaction rates of the network; for example, reaction rate u represents the conversion of A + D to B. The reaction rate u is inhibited by compound C, as indicated in the graph. Possible mathematical formulations for the reaction rates u and v are also shown in the figure. The system of differential equations for the network of metabolic pathways results from the flux balance equations.
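The step from the graph to the balance equations can be sketched as dc/dt = N r(c), where N is the stoichiometric matrix and r the vector of reaction rates. The sketch below is only an illustration: Figure 1 is not reproduced here, so the topology is an assumption. Only the reaction u (A + D to B) is taken from the text; the roles of p (supply of A), v (B to C + D) and q (removal of C) are hypothetical.

```python
# Toy balance equations dc/dt = N * r for four compounds A, B, C, D.
# Only u (A + D -> B) comes from the text; p, v and q are assumed roles.

COMPOUNDS = ["A", "B", "C", "D"]
REACTIONS = ["p", "u", "v", "q"]

# Stoichiometric matrix N: rows = compounds, columns = reactions.
N = [
    # p   u   v   q
    [1, -1, 0, 0],   # A: produced by p, consumed by u
    [0, 1, -1, 0],   # B: produced by u, consumed by v
    [0, 0, 1, -1],   # C: produced by v, consumed by q
    [0, -1, 1, 0],   # D: consumed by u, regenerated by v
]


def balances(rates):
    """Return dc/dt for each compound, given the rates [p, u, v, q]."""
    return [sum(n_ij * r_j for n_ij, r_j in zip(row, rates)) for row in N]


# With all four fluxes equal, every compound is balanced (steady state).
print(dict(zip(COMPOUNDS, balances([2.0, 2.0, 2.0, 2.0]))))
```

Keeping the stoichiometry in a matrix separates the network structure from the rate laws, so a model variant with different kinetics reuses the same N.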
Metabolic regulation phenomena are assumed to result mainly from enzyme activities. These activities determine the kinetic rates u, v, p and q described in Figure 1. The enzyme activities are not constant but are influenced by a large number of compounds present inside a cell. A kinetic rate equation must include such effects in order to represent the dynamics of the pathway correctly. In Figure 1, the rate u depends on the compounds A and D and additionally on the compound C, which has an inhibitory effect. Setting up rate equations correctly requires knowledge of regulation phenomena, which is still very limited. The problem results from the fact that enzyme activities are mainly analyzed in vitro using a small number of compounds [1]. Under in vivo conditions, where a large number of different compounds are present, different effects may be important for the enzyme activity, and these effects are usually unknown.
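To make the dependence of u on A, D and C concrete, the following sketch uses one common textbook form: saturation kinetics in both substrates multiplied by a non-competitive inhibition term. The formulations shown in Figure 1 are not available in this text, so the functional form, the parameter names (u_max, k_a, k_d, k_i) and their values are all assumptions chosen for illustration.

```python
def rate_u(a, d, c, u_max=1.0, k_a=0.5, k_d=0.5, k_i=1.0):
    """Hypothetical rate law for A + D -> B, inhibited by C.

    Assumed form: u = u_max * A/(K_A + A) * D/(K_D + D) * 1/(1 + C/K_I).
    """
    saturation = (a / (k_a + a)) * (d / (k_d + d))
    inhibition = 1.0 / (1.0 + c / k_i)
    return u_max * saturation * inhibition


# Same substrate levels, without and with the inhibitor C present.
print(rate_u(0.5, 0.5, 0.0))  # → 0.25
print(rate_u(0.5, 0.5, 1.0))  # → 0.125
```

Replacing the inhibition term (or dropping it) yields a different model variant with the same network structure, which is the kind of alternative that has to be tested against data.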
Because the exact form of the rate equations is not known, it is not possible at this time to construct one 'correct' complete biochemical pathway model. Instead, a large number of model variants with different kinetic reaction rates, different assumptions on the effectors, and even different pathways have to be constructed and analyzed. The analysis includes testing the ability of a model to describe experimental data. This requires parameter identification algorithms and intracellular measurements from experiments. To support this approach, several rapid sampling experiments using Escherichia coli K12 colonies have been performed at the Institute of Biotechnology to identify in vivo enzyme kinetics. Up to 30 in vivo metabolite and nucleotide concentrations were measured at a frequency of 4 Hz over a time frame of 45 seconds. During the measurements, a substrate pulse was applied to analyze the cellular response to substrate availability after a long period of substrate limitation. These data provide the basis for analyzing the constructed models and identifying the most realistic model.
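The size of such a data set follows from the sampling figures given above. The short calculation below is a sketch under two assumptions that the text does not state: the first sample is taken at t = 0 s, and every compound is measured at every sampling time, which makes the total an upper bound.

```python
SAMPLING_RATE_HZ = 4   # from the text
WINDOW_S = 45          # from the text
MAX_COMPOUNDS = 30     # "up to 30" compounds, so this is an upper bound

samples_per_compound = SAMPLING_RATE_HZ * WINDOW_S
# Assumed sampling grid: t = 0, 0.25, 0.5, ... seconds.
sample_times = [k / SAMPLING_RATE_HZ for k in range(samples_per_compound)]
max_data_points = samples_per_compound * MAX_COMPOUNDS

print(samples_per_compound, max_data_points)  # → 180 5400
print(sample_times[:3], sample_times[-1])     # → [0.0, 0.25, 0.5] 44.75
```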
The MMT software was developed for building a large number of variants of complex pathway models and has been used to analyze data from the rapid sampling experiments described above. MMT provides a structured overview of the metabolic network (Figure 2) and includes simulation and parameter identification algorithms. Many of the available pathway modeling tools do not provide such a convenient overview (e.g. KINSIM/FITSIM [3], GEPASI [8], DynaFit [7]), are severely limited in dimension (e.g. MetaTool [9], DBsolve [4]) or do not include parameter identification algorithms (e.g. SCAMP/JARNAC [11], E-cell [12]). A main difference between MMT and all other pathway modeling tools known to the authors at this time is that all model information is stored in a relational database. This database stores information on three levels:
Figure 2: Screenshot of the main window of the MMT software. The left part of the window shows chemical reactions sorted by locations (e.g. cytoplasm) and reaction groups (e.g. pathway glycolysis). The right part of the window shows details of a chosen reaction.
As a database front-end, the MMT software allows models to be defined and analyzed quickly by reusing data from previously defined and analyzed models stored in the database. It translates the pathway model into a system of ordinary differential equations, which is exported as C code using MAPLE 6. In addition, it calculates the sensitivity functions, i.e. the derivatives of the solution of the equations with respect to the parameters and initial conditions. These functions are then also available as C code.
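For reference, sensitivity functions of this kind are commonly obtained from the standard sensitivity equations of an ODE system. The block below states them in generic notation; the symbols x (concentrations), θ (parameters), x_0 (initial conditions), S and Z are introduced here for illustration and are not taken from MMT, and the parameters are assumed not to enter the initial conditions.

```latex
\begin{aligned}
\dot{x}(t) &= f\bigl(x(t), \theta\bigr), & x(0) &= x_0, \\
\dot{S}(t) &= \frac{\partial f}{\partial x}\, S(t) + \frac{\partial f}{\partial \theta},
  & S(0) &= 0, \qquad S = \frac{\partial x}{\partial \theta}, \\
\dot{Z}(t) &= \frac{\partial f}{\partial x}\, Z(t),
  & Z(0) &= I, \qquad Z = \frac{\partial x}{\partial x_0}.
\end{aligned}
```

The Jacobians ∂f/∂x and ∂f/∂θ are what a computer algebra system can derive symbolically from the rate equations, after which S and Z are integrated together with x.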
The MMT software provides algorithms for simulation (LSODE [5]) and parameter identification (the Nelder-Mead based SUBPLEX [10]). These algorithms are C and FORTRAN codes that are created and executed by MMT. They use the C code for the system of ordinary differential equations that is exported from the MMT software. Results are stored in the database and can be viewed and printed. Identification of 'good' models through model discrimination is then based, for example, on fitting quality (agreement of simulation output and experimental data), parameter identifiability and parameter sensitivities.
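The identification loop (simulate, compare with data, adjust parameters) can be illustrated on a deliberately small example. The sketch below is not MMT code and uses neither LSODE nor SUBPLEX: it fits a hypothetical first-order decay model, whose closed-form solution stands in for the ODE solver, with a plain, simplified Nelder-Mead simplex search. The model, the synthetic noise-free data and all names are assumptions made for illustration.

```python
import math


def simulate(k, x0, times):
    """Closed-form solution of dx/dt = -k*x, standing in for an ODE solver."""
    return [x0 * math.exp(-k * t) for t in times]


def ssr(params, times, data):
    """Fitting quality: sum of squared residuals between model and data."""
    k, x0 = params
    return sum((m - y) ** 2 for m, y in zip(simulate(k, x0, times), data))


def point_along(centroid, worst, t):
    """Point on the line through the centroid; t > 0 moves away from worst."""
    return [c + t * (c - w) for c, w in zip(centroid, worst)]


def nelder_mead(f, start, step=0.2, iterations=300):
    """Simplified Nelder-Mead: reflection, expansion, inside contraction, shrink."""
    n = len(start)
    simplex = [list(start)]
    for i in range(n):
        vertex = list(start)
        vertex[i] += step
        simplex.append(vertex)
    for _ in range(iterations):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]
        reflected = point_along(centroid, worst, 1.0)
        if f(reflected) < f(best):
            expanded = point_along(centroid, worst, 2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = point_along(centroid, worst, -0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway toward the best one.
                simplex = [best] + [
                    [(b + x) / 2 for b, x in zip(best, v)] for v in simplex[1:]
                ]
    return min(simplex, key=f)


# Synthetic, noise-free "measurements" at 4 Hz over 5 s (hypothetical values).
times = [i / 4 for i in range(21)]
data = simulate(0.8, 2.0, times)

k_fit, x0_fit = nelder_mead(lambda p: ssr(p, times, data), start=[0.3, 1.0])
print(f"k = {k_fit:.4f}, x0 = {x0_fit:.4f}")
```

Running the same loop for each model variant and comparing the final residuals gives the fitting-quality criterion named above; identifiability and sensitivity checks would need the sensitivity functions as well.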