In Silico Biology 7 S1, 10 (2007); ©2007, Bioinformation Systems e.V.  

Workshop "Storage and Annotation of Reaction Kinetics Data"
May 2007, Heidelberg, Germany


Integration of CellDesigner and SABIO-RK


Akira Funahashi1,2,3,4*, Akiya Jouraku1,5, Yukiko Matsuoka1,4 and Hiroaki Kitano1,3,4,6




1 JST/ERATO-SORST Kitano Symbiotic Systems Project, Japan
2 School of Medicine, Keio University, Japan
3 School of Fundamental Science and Technology, Keio University, Japan
4 The Systems Biology Institute, Japan
5 School of Science for Open and Environmental Systems, Keio University, Japan
6 Sony Computer Science Laboratories, Inc., Japan



* Corresponding author

   Email: funa@symbio.jst.go.jp





Edited by I. Rojas and U. Wittig (guest editors); received and accepted March 21, 2007; published March 28, 2007



Abstract

Understanding the logic and dynamics of gene-regulatory and biochemical networks is a major challenge for systems biology. To facilitate this research, we have developed CellDesigner to visualize, model and simulate biochemical networks. CellDesigner allows users to easily create networks using a solidly defined and comprehensive graphical notation. CellDesigner uses SBML to describe models and can simulate them using an integrated SBML ODE Solver or a third-party simulation engine, thus enabling users to run simulations through a sophisticated graphical user interface. Although CellDesigner can already integrate with existing databases (KEGG, PubMed, BioModels, etc.) by calling a web browser or by connecting to their web pages through HTTP, integration with SABIO-RK has the potential to expand connectivity and semi-automate visualization and model building. SABIO-RK contains information about biochemical reactions, related kinetic equations and parameters, as well as the experimental conditions under which these parameters were measured. By using the Web service API provided by the SABIO-RK team, we have succeeded in connecting directly to the database, sending search queries by the ID or name of a component, and importing the query results into CellDesigner.

Keywords: systems biology, SBML, SBGN, database integration, Web services, kinetic modeling, biochemical simulation



Introduction

Systems biology is characterized by synergistic integration of theory, computational modeling and experiments [1]. While software infrastructure is one of the most crucial components of systems biology research, until recently there has been no common infrastructure or standard to enable integration of computational resources. To solve this problem, the Systems Biology Markup Language (SBML, http://sbml.org) [2, 3] and the Systems Biology Workbench (SBW, http://sbw.kgi.edu) [4] have been developed. SBML is an open, XML-based format for representing biochemical reaction networks, and SBW is a modular, broker-based, message-passing framework for simplified communication of models between applications. Rapid acceptance of this standard means that more than 100 simulation and analysis software packages already support SBML or are in the process of adding support for it.

Identification of the logic and dynamics of gene-regulatory and biochemical networks is a major challenge for systems biology. We believe that standardized technologies such as SBML, SBW and SBGN (Systems Biology Graphical Notation) will play a critical role as the software platform to tackle this challenge. As one such approach, we have developed CellDesigner [5], a process diagram editor for gene-regulatory and biochemical networks. CellDesigner currently supports model creation, simulation, and database integration, which is important for users who want to create their model from scratch. In addition, inclusion of accession numbers or entity IDs for each database enables models to directly call databases from CellDesigner, thus enriching the information content of the model. Such features are most useful if the modeling tool can import the information on the fly. CellDesigner can already import models directly from the BioModels database [6], which enables users to efficiently open and simulate the BioModels inventory. Here, we expand this functionality by integrating the SABIO-Reaction Kinetics database (SABIO-RK) [7] with CellDesigner, allowing users to import additional information into each object on the fly. SABIO-RK is a web-accessible database containing biochemical reaction kinetics data for systems biologists. It merges general reaction information retrieved from external databases with kinetic data manually extracted from the literature. SABIO-RK contains information about biochemical reactions, including compounds, kinetics and the pathways in which they participate. This information is accessible through the Internet either through a user interface or through a Web service API (Application Program Interface) [8]. By using the Web service API, we implemented an extension for CellDesigner whereby users can import information from SABIO-RK for each query object.



CellDesigner

The current version of CellDesigner has the following features:

The aim in developing CellDesigner was to supply a platform-independent process diagram editor that utilizes standardized technology. By using standardized technology, any model can be easily ported to other applications, thereby reducing the cost of creating a specific model from scratch. The main standardized features that CellDesigner supports can be summarized as "graphical notation", "model description" and "application integration environment". The standard for graphical notation plays an important role in the efficient and accurate dissemination of knowledge [10], and the standard for model description enhances the portability of models between software tools and aids human readability. Similarly, the standard for the application integration environment helps software developers to enable their applications to communicate with other tools.


Symbols and expressions

CellDesigner supports graphical notation and listing of symbols based on a proposal by Kitano et al. [10]. The definition of graphical notation was an international effort to produce the Systems Biology Graphical Notation (SBGN, http://sbgn.org) standard. Although several graphical notation systems have already been proposed [11-15], each has drawbacks that prevent it from becoming a standard [10]. SBGN has been designed to describe biological networks with sufficient information in an unambiguous way. We expect that these features will become part of the standardized technology for systems biology. The key requirements for SBGN are:

  1. To allow representation of diverse biological objects and interactions
  2. To be semantically and visually unambiguous
  3. To be able to incorporate notations
  4. To allow software tools to convert a graphically represented model into mathematical formulae for analysis and simulation
  5. To have software support to draw diagrams
  6. The notation scheme to be freely available

To meet the above requirements, the notation uses process diagrams to graphically represent the state transitions of molecules. A process diagram consists of nodes indicating the state of a molecule or complex, and edges that represent state transitions of molecules. In conventional entity-relationship diagrams, edges generally represent activation of the molecule. However, this confuses diagram semantics and limits the molecular processes that can be represented [10]. Process diagrams offer a more intuitive way to define a model than entity-relationship diagrams, because a process diagram explicitly represents temporal sequences, i.e., whether or not the molecule is activated is represented as the state of the node. In a process diagram, promoting and inhibiting catalytic events are represented as modifiers of state transitions using a circle-headed or bar-headed line, respectively.

While process diagrams are the preferred solution for representing temporal sequences, both process and entity-relationship diagrams are equally suited to representing static networks. Moreover, both notations can maintain detailed information internally in a machine-readable syntax such as SBML [10]. One of the foundations of SBGN is to supply a set of notations that enhance the formality and richness of the represented or human-readable information. For such a graphical notation to be practical and to be accepted by the community, it is essential that software tools and data resources be made available. Even if the proposed notation system satisfies the requirements of biologists, any lack of software support will limit its advantages. CellDesigner currently supports the majority of the proposed process diagram notation, and will fully implement it in the near future (Fig. 1).


Figure 1: Screenshot of CellDesigner.




SBML Compliant

CellDesigner supports SBML reading and writing capabilities. SBML is a tool-neutral, computer-readable format for representing biochemical reaction models; it is applicable to metabolic networks, cell-signaling pathways, gene regulatory networks, and other modeling problems in systems biology. SBML is based on XML (eXtensible Markup Language), a simple, flexible text format for exchanging a wide variety of data. The initial version of the specification was released in March 2001 as SBML Level-1. The most recent released version of SBML is Level-2 Version 2 (as of Feb. 2007). CellDesigner utilizes SBML as its native model description language; therefore, once a model is created using CellDesigner, all the information inside the model is stored in SBML, resulting in high model portability. For example, genes and proteins are stored as a list of <species> elements under the <listOfSpecies> tag, and reactions are stored as a list of <reaction> elements under the <listOfReactions> tag. Kinetic laws, which are required for ODE-based simulation, are stored under <kineticLaw> tags, whose formulas are written in MathML in SBML Level-2 (Fig. 2).
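
As an illustration of this element layout, the following minimal sketch reads a hand-written toy SBML Level-2 document with Python's standard XML parser. The model content (identifiers "S1", "S2", "R1", "k1" and their values) is invented for the example, and the standard library parser is used here only for brevity; it is not the way CellDesigner itself reads SBML.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2/version2"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Toy model written by hand for this example (not exported by CellDesigner).
SBML_DOC = f"""<sbml xmlns="{SBML_NS}" level="2" version="2">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cell" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="S1" name="substrate" compartment="cell" initialConcentration="10"/>
      <species id="S2" name="product" compartment="cell" initialConcentration="0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R1" reversible="false">
        <listOfReactants><speciesReference species="S1"/></listOfReactants>
        <listOfProducts><speciesReference species="S2"/></listOfProducts>
        <kineticLaw>
          <math xmlns="{MATHML_NS}">
            <apply><times/><ci> k1 </ci><ci> S1 </ci></apply>
          </math>
          <listOfParameters>
            <parameter id="k1" value="0.1"/>
          </listOfParameters>
        </kineticLaw>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""

ns = {"sbml": SBML_NS, "m": MATHML_NS}
root = ET.fromstring(SBML_DOC)

# Species live under <listOfSpecies>, reactions under <listOfReactions>.
species_ids = [s.get("id") for s in root.findall(".//sbml:listOfSpecies/sbml:species", ns)]
reaction = root.find(".//sbml:listOfReactions/sbml:reaction", ns)

# The kinetic law carries a MathML formula plus its local parameters.
law = reaction.find("sbml:kineticLaw", ns)
symbols = [ci.text.strip() for ci in law.findall(".//m:ci", ns)]
params = {p.get("id"): float(p.get("value"))
          for p in law.findall("sbml:listOfParameters/sbml:parameter", ns)}

print(species_ids, reaction.get("id"), symbols, params)
```

In practice a dedicated library such as libSBML would normally be preferred, since it also validates the document against the SBML specification.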


Figure 2: Relationship between objects in CellDesigner and SBML.


However, the graphical notation and layout information produced by CellDesigner is not currently supported by SBML, so CellDesigner uses the <annotation> tag of SBML to store this information. If an SBML model has no CellDesigner-compatible layout information, an auto-layout function can be run to lay out SBML Level-1 and Level-2 models. Using this function, users can quickly lay out existing SBML models, such as models converted from KEGG and models from the BioModels database.
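
The general mechanism of keeping tool-specific data inside <annotation> can be sketched as follows. The namespace URI and the "bounds" element are hypothetical placeholders chosen for this example; they are not the element names that CellDesigner actually writes.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2/version2"
# Hypothetical namespace for illustration only (not CellDesigner's own).
LAYOUT_NS = "http://example.org/ns/hypothetical-layout"
ET.register_namespace("layout", LAYOUT_NS)


def attach_position(species, x, y):
    """Store a node position inside the species' <annotation> element."""
    annotation = species.find(f"{{{SBML_NS}}}annotation")
    if annotation is None:
        annotation = ET.Element(f"{{{SBML_NS}}}annotation")
        species.insert(0, annotation)  # annotation precedes other children
    bounds = ET.SubElement(annotation, f"{{{LAYOUT_NS}}}bounds")
    bounds.set("x", str(x))
    bounds.set("y", str(y))
    return bounds


def read_position(species):
    """Return (x, y) if layout data is present, otherwise None."""
    bounds = species.find(f"{{{SBML_NS}}}annotation/{{{LAYOUT_NS}}}bounds")
    if bounds is None:
        return None
    return float(bounds.get("x")), float(bounds.get("y"))


species = ET.Element(f"{{{SBML_NS}}}species", {"id": "S1", "compartment": "cell"})
plain = ET.Element(f"{{{SBML_NS}}}species", {"id": "S2", "compartment": "cell"})

attach_position(species, 120.0, 80.0)
serialized = ET.tostring(species, encoding="unicode")
print(serialized)
print(read_position(species), read_position(plain))
```

A tool that does not know the extra namespace can ignore the annotation and still read the species, which is why this approach keeps the model portable. A model without such data, like the second species here, is the case in which an auto-layout step would be needed.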


Database connection capability

A crucial process in network construction and analysis is linking databases from diverse sources. We have added this capability to enable direct connection with the following databases:

Once a node and organism are selected, users can query databases via a pop-up menu. For example, the PubMed ID search utilizes notes written in the components. The BioModels database connection allows the import of SBML-based models, which are curated computational models prepared for simulations. For the integration with SABIO-RK, we have enhanced this function and implemented a new feature to import information from SABIO-RK to each reaction.



SABIO-RK and its web service API

SABIO-RK provides data about the kinetics of biochemical reactions in different organisms and tissues, determined under diverse experimental conditions [7]. The database originated from a database developed for Mycoplasma pneumoniae, and focuses on querying reactions and their kinetic data. SABIO-RK merges information about biochemical reactions and pathways collected from other databases (e.g. KEGG) with corresponding kinetic and experimental data manually mined from the literature. The manual curation is assisted by semi-automatic tools, and the database can be accessed via a web-based user interface or through a Web service. Both the user interface and the Web service support export to SBML.


SABIO-RK Web Service API

The SABIO-RK Web service [8] provides customizable, language-independent points of entry into the SABIO-RK system. It enables software developers to write their own clients to customize and automate access to SABIO-RK. A Web service is broadly defined as a software system designed to support interoperable machine-to-machine interaction over a network, but in practice the term refers to XML-based information exchange systems that use the Internet for direct application-to-application interaction [16]. The SABIO-RK Web service uses SOAP (Simple Object Access Protocol) messages and SOAP-formatted XML envelopes, with its interface described in WSDL (Web Services Description Language). Essentially, the SABIO-RK Web service provides an API that returns information about an internal object given its ID or name. For example, a client can obtain a list of reactions (Reaction IDs) by calling the "getReactionIDs" function with the name of a pathway as its argument, and can then obtain the list of products or substrates of a reaction by calling the "getProductsSpeciesIDs" or "getSubstratesSpeciesIDs" function with a Reaction Instance ID as its argument. A "Reaction Instance ID" identifies the occurrence of a reaction (Reaction ID) in a particular organism, with a particular kinetic law, with certain modifiers/species, in particular locations. The list of Reaction Instance IDs is obtained by calling the "getReactionInstanceIDs" function with a Reaction ID as its argument. The relationship between each API function and its returned object, as well as the flow of the obtained information, has been described previously [8]. In brief, the following information is accessible using the SABIO-RK Web service.

The Web service API also provides functions that allow a client to directly call search functions within SABIO-RK: "getPathwayNames", "searchCompounds", "searchEnzymesByName" and "searchEnzymesByECNumber". Each of these functions accepts a search string as its argument and returns an array of recommended pathway, compound or enzyme names, respectively. These search functions enhance the design flexibility of client applications, and also reduce computational and communication overheads because each search is performed on the server side.

Most API functions return an array (or a single value) of integers or strings, but some functions return XML strings. The function "GenerateSBMLentities" accepts as its arguments an array of SABIO Reaction Instance IDs, a hash table that maps Reaction Instance IDs to their kinetic law IDs, the SBML level and version numbers, and the name of the SBML model, and returns the generated SBML. This SBML contains a model with the specified reactions and kinetic laws. Because the client receives the model in SBML, the reactions, kinetic laws and parameters are easily imported.
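
To make the message exchange concrete, the sketch below builds a SOAP 1.1 request envelope for "getReactionIDs" and extracts an array of values from a canned response. Only the SOAP envelope namespace is standard; the service namespace, the parameter name "pathwayName", the "item" element and the response content are assumptions made for this example, since the real names are defined by the service's WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1
SERVICE_NS = "urn:example:sabiork"  # hypothetical; the WSDL defines the real one


def build_request(operation, arguments):
    """Wrap an operation name and its arguments in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in arguments.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_string_array(response_xml, item_tag="item"):
    """Collect the text of every array item found in the SOAP body."""
    root = ET.fromstring(response_xml)
    body = root.find(f"{{{SOAP_ENV_NS}}}Body")
    return [element.text for element in body.iter(item_tag)]


request_xml = build_request("getReactionIDs", {"pathwayName": "Glycolysis"})

# Canned response with made-up reaction IDs, standing in for the server reply.
response_xml = (
    f'<soapenv:Envelope xmlns:soapenv="{SOAP_ENV_NS}"><soapenv:Body>'
    "<getReactionIDsReturn><item>12</item><item>34</item></getReactionIDsReturn>"
    "</soapenv:Body></soapenv:Envelope>"
)
reaction_ids = parse_string_array(response_xml)
print(request_xml)
print(reaction_ids)
```

A SOAP toolkit normally generates this plumbing from the WSDL, so client code only sees ordinary function calls that return integers, strings or arrays.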



Implementation and results

Integration of CellDesigner and SABIO-RK was implemented using the SABIO-RK Web service API. The basic idea of the integration is to add information obtained from SABIO-RK to an object created in CellDesigner. In CellDesigner, most objects (species, reactions, etc.) are stored as SBML objects. Additional information that is not supported by SBML, such as graphical and layout information and detailed descriptions of species and reactions, is stored as CellDesigner-specific objects. An SBML object is manipulated using libSBML (a library to read, write, manipulate, translate and validate SBML files and data streams; http://www.sbml.org/software/libsbml/) [3]. CellDesigner-specific objects are manipulated through CellDesigner's own API in a similar way [17]. Although these two kinds of objects differ, their classes have a similar structure; therefore a model created by CellDesigner can be easily exported. The SABIO-RK Web service API returns basic values such as integers or strings, so we have implemented functions that import these returned values into the query objects using libSBML within CellDesigner.

Our approach was:

  1. Search for related reactions in SABIO-RK by the name or EC number of the species selected in CellDesigner
  2. Obtain a list of reactions and their related kinetic laws
  3. Import the species-specific kinetic laws

To do this, we listed all the required methods of the SABIO-RK Web service and then implemented code to call these methods from CellDesigner. The values returned by the SABIO-RK Web service are processed in CellDesigner. As shown in Fig. 3, the methods are called sequentially, each using the values returned by the previous call.


Figure 3: Schematic view of integration of CellDesigner and SABIO-RK.


To obtain the kinetic laws for a given species name or EC number, a client first calls "searchCompounds", "searchEnzymesByName" or "searchEnzymesByECNumber" to get recommended compound or enzyme names. No rule governs which term is used for a species' name in CellDesigner and in SABIO-RK, so these search functions help to map terms between the two. After obtaining a species' name, the client calls "getReactionIDFromEnzyme" or "getReactionIDFromCompound" to obtain an array of reaction IDs. The client then calls "getKinLawIDs" with a reaction ID to obtain the associated kinetic law IDs. We have implemented code that calls "GenerateSBMLentities" with the reaction ID and a table of reaction and kinetic law IDs. The received SBML string is parsed with the libSBML API, and the kinetic laws and list of parameters are then imported into the corresponding reaction. Fig. 4 shows a screenshot of the GUI window in CellDesigner that uses this implementation. The GUI window also allows a search function to be called manually with a given name or EC number.
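
The call sequence described above can be sketched as follows. The method names are those of the SABIO-RK Web service, but the client class is an in-memory stand-in with made-up data, the argument order of "GenerateSBMLentities" is assumed, and Python's standard XML parser replaces libSBML for brevity. It is therefore an illustration of the flow, not the code used in CellDesigner.

```python
import copy
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2/version2"
Q = f"{{{SBML_NS}}}"  # qualified-name prefix for SBML elements


class FakeSabioClient:
    """In-memory stand-in for the remote service; all values are made up."""

    def searchEnzymesByECNumber(self, ec_number):
        return ["hexokinase"] if ec_number == "2.7.1.1" else []

    def getReactionIDFromEnzyme(self, enzyme_name):
        return [101] if enzyme_name == "hexokinase" else []

    def getKinLawIDs(self, reaction_id):
        return [5001] if reaction_id == 101 else []

    def GenerateSBMLentities(self, reaction_ids, kinlaw_table, level, version, name):
        return (
            f'<sbml xmlns="{SBML_NS}" level="{level}" version="{version}">'
            f'<model id="{name}"><listOfReactions><reaction id="R101"><kineticLaw>'
            '<math xmlns="http://www.w3.org/1998/Math/MathML">'
            "<apply><times/><ci>kcat</ci><ci>S</ci></apply></math>"
            '<listOfParameters><parameter id="kcat" value="2.5"/></listOfParameters>'
            "</kineticLaw></reaction></listOfReactions></model></sbml>"
        )


def fetch_kinetic_law(client, ec_number):
    """Follow the chain: enzyme name, reaction IDs, kinetic law IDs, SBML."""
    enzymes = client.searchEnzymesByECNumber(ec_number)
    if not enzymes:
        return None
    reaction_ids = client.getReactionIDFromEnzyme(enzymes[0])
    if not reaction_ids:
        return None
    reaction_id = reaction_ids[0]
    law_ids = client.getKinLawIDs(reaction_id)
    if not law_ids:
        return None
    sbml = client.GenerateSBMLentities(
        [reaction_id], {reaction_id: law_ids[0]}, 2, 2, "imported")
    return ET.fromstring(sbml).find(f".//{Q}kineticLaw")


def import_kinetic_law(target_reaction, law):
    """Replace any existing kinetic law of the target reaction with a copy."""
    old = target_reaction.find(f"{Q}kineticLaw")
    if old is not None:
        target_reaction.remove(old)
    target_reaction.append(copy.deepcopy(law))


target = ET.Element(f"{Q}reaction", {"id": "re1"})  # reaction selected by the user
law = fetch_kinetic_law(FakeSabioClient(), "2.7.1.1")
import_kinetic_law(target, law)
imported_params = [p.get("id") for p in target.iter(f"{Q}parameter")]
print(imported_params)
```

Each step returns early when the service finds nothing, which corresponds to the case in which a search term used in the model has no counterpart in the database.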


Figure 4: Screenshot of the GUI window for integration of CellDesigner and SABIO-RK.


Another possible scenario is as follows:

  1. Search for related reactions in SABIO-RK by the name of a pathway
  2. Obtain a list of reactions and their related kinetic laws for the specified pathway
  3. Import the whole set of reactions as an SBML model

The major difference here is the implementation of the code to call the "getPathwayNames" and "getReactionIDs" methods. The "getPathwayNames" method returns a string array of pathway names that match the substring given as its argument. The "getReactionIDs" method returns an array of reaction IDs, taking the pathway name as its argument. A client can obtain the list of reaction IDs involved in a pathway by calling these two methods. Once the lists of reaction IDs and kinetic law IDs are obtained, a client can call "GenerateSBMLentities" to obtain the whole pathway in SBML. Once the SBML model is obtained, CellDesigner can invoke an automatic layout. Both of the above features and the GUI are implemented in Java and incorporated into CellDesigner. The GUI window is accessible from the [Database] menu in CellDesigner. The Apache Axis API is used to call the SABIO-RK Web service from Java.
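
The pathway-driven scenario differs mainly in how the reaction IDs are collected, which the following sketch illustrates. The service class is again an in-memory stand-in with invented pathway names and IDs, and the choice of the first matching pathway and the first kinetic law per reaction is a simplification made for this example (in CellDesigner the user selects them in the GUI). The actual implementation is written in Java; Python is used here only for brevity.

```python
class FakePathwayService:
    """In-memory stand-in for the remote service; all values are made up."""

    _pathways = {"Glycolysis": [101, 102], "Glycerolipid metabolism": [201]}
    _laws = {101: [5001, 5002], 102: [], 201: [7001]}

    def getPathwayNames(self, substring):
        return [n for n in self._pathways if substring.lower() in n.lower()]

    def getReactionIDs(self, pathway_name):
        return list(self._pathways.get(pathway_name, []))

    def getKinLawIDs(self, reaction_id):
        return list(self._laws.get(reaction_id, []))


def collect_pathway_request(service, substring):
    """Gather the reaction IDs and the reaction-to-kinetic-law table
    that a later "GenerateSBMLentities" call would need."""
    names = service.getPathwayNames(substring)
    if not names:
        return None
    pathway = names[0]  # simplification: take the first match
    reaction_ids = service.getReactionIDs(pathway)
    table = {}
    for reaction_id in reaction_ids:
        law_ids = service.getKinLawIDs(reaction_id)
        if law_ids:  # reactions without kinetic data get no table entry
            table[reaction_id] = law_ids[0]
    return pathway, reaction_ids, table


request = collect_pathway_request(FakePathwayService(), "glycolysis")
print(request)
```

The returned reaction list and table correspond to the first two arguments of "GenerateSBMLentities"; the resulting SBML model would then be passed to the automatic layout function.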



Conclusion

In this paper, we have explained how CellDesigner and SABIO-RK were integrated. This integration is based on a Web service API developed by the SABIO-RK team. CellDesigner uses SBML as its model description language, with additional annotations for graphical layout. Because SABIO-RK can export to SBML, the obtained information is easily imported into CellDesigner. The use of a language-neutral access method (Web services) and a tool-neutral model description language (SBML) facilitated this integration, which enables users to add information from SABIO-RK to an object inside CellDesigner on the fly.



Acknowledgements

We would like to thank Saqib Mir (EML Research, Germany), Martin Golebiewski (EML Research, Germany) and Isabel Rojas (EML Research, Germany) for developing the SABIO-RK Web service, and for fruitful discussions. This research is supported in part by the ERATO-SORST program (Japan Science and Technology Agency), the International Standard Development area of the International Joint Research Grant (NEDO, Japanese Ministry of Economy, Trade, and Industry), the Strategic Japanese-Swedish Cooperative Program on "Multidisciplinary BIO" (JST-VINNOVA/SSF), Establishment of a Human Genome Network Platform (MEXT) and the special coordination funds for promoting science and technology from the Japanese government's Ministry of Education, Culture, Sports, Science, and Technology (MEXT).




References


  1. Kitano, H. (2002). Systems biology: A brief overview. Science 295, 1662-1664.

  2. Hucka, M., et al.; SBML Forum (2003). The Systems Biology Markup Language (SBML): A medium for representation and exchange of biochemical network models. Bioinformatics 19, 524-531.

  3. Hucka, M., Finney, A., Bornstein, B. J., Keating, S. M., Shapiro, B. E., Matthews, J., Kovitz, B. L., Schilstra, M. J., Funahashi, A., Doyle, J. C. and Kitano, H. (2004). Evolving a lingua franca and associated software infrastructure for computational systems biology: The Systems Biology Markup Language (SBML) Project. Syst. Biol. (Stevenage) 1, 41-53.

  4. Sauro, H. M., Hucka, M., Finney, A., Wellock, C., Bolouri, H., Doyle, J. and Kitano, H. (2003). Next generation simulation tools: the Systems Biology Workbench and BioSPICE integration. OMICS 7, 355-372.

  5. Funahashi, A., Tanimura, N., Morohashi, M. and Kitano, H. (2003). CellDesigner: a process diagram editor for gene-regulatory and biochemical networks. BIOSILICO 1, 159-162.

  6. Le Novère, N., Bornstein, B., Broicher, A., Courtot, M., Donizelli, M., Dharuri, H., Li, L., Sauro, H., Schilstra, M., Shapiro, B., Snoep, J. L. and Hucka, M. (2006). BioModels Database: A Free, Centralized Database of Curated, Published, Quantitative Kinetic Models of Biochemical and Cellular Systems. Nucleic Acids Res. 34, D689-D691.

  7. Wittig, U., Golebiewski, M., Kania, R., Krebs, O., Mir, S., Weidemann, A., Anstein, S., Saric, J. and Rojas, I. (2006). SABIO-RK: Integration and Curation of Reaction Kinetics Data. In: Proceedings of the 3rd International workshop on Data Integration in the Life Sciences 2006 (DILS'06). Hinxton, UK. Lecture Notes in Bioinformatics 4075, 94-103.

  8. SABIO-RK Web Service Reference Manual. http://sabio.villa-bosch.de/SABIORK/webservicedoc.jsp

  9. Machné, R., Finney, A., Müller, S., Lu, J., Widder, S. and Flamm, C. (2006). The SBML ODE Solver Library: a native API for symbolic and fast numerical analysis of reaction networks. Bioinformatics 22, 1406-1407.

  10. Kitano, H., Funahashi, A., Matsuoka, Y. and Oda, K. (2005). Using process diagrams for the graphical representation of biological networks. Nat. Biotechnol. 23, 961-966.

  11. Kohn, K. W. (1999). Molecular interaction map of the mammalian cell cycle control and DNA repair systems. Mol. Biol. Cell 10, 2703-2734.

  12. Kohn, K. W. (2001). Molecular interaction maps as information organizers and simulation guides. Chaos 11, 84-97.

  13. Pirson, I., Fortemaison, N., Jacobs, C., Dremier, S., Dumont, J. E. and Maenhaut, C. (2000). The visual display of regulatory information and networks. Trends Cell Biol. 10, 404-408.

  14. Cook, D. L., Farley, J. F. and Tapscott, S. J. (2001). A basis for a visual language for describing, archiving and analyzing functional models of complex biological systems. Genome Biol. 2, RESEARCH0012.

  15. Maimon, R. and Browning, S. (2001). Diagrammatic Notation and Computational Structure of Gene Networks. In: Proceedings of the Second International Conference on Systems Biology, 311-317.

  16. Web Services Architecture. W3C Working Group Note. http://www.w3.org/TR/ws-arch/

  17. The Systems Biology Institute and Mitsui Knowledge Industry Co. Ltd. (2006). CellDesigner Plugin Tutorial document. http://celldesigner.org/documents/PluginTutorial40A.pdf