In Silico Biology 8, 0015 (2008); ©2008, Bioinformation Systems e.V.  


ROSY - a flexible and universal database and bioinformatics tool platform for Roseobacter related species


Claudia Pommerenke1, Inga Gabriel1, Boyke Bunk1, Richard Münch1, Isam Haddad1, Petra Tielen1, Irene Wagner-Döbler2 and Dieter Jahn1*




1 Institut für Mikrobiologie, Technische Universität Braunschweig, Spielmannstr. 7, D-38106 Braunschweig, Germany
2 Helmholtz-Zentrum für Infektionsforschung GmbH, Inhoffenstr. 7, D-38124 Braunschweig, Germany



* Corresponding author

   Email: d.jahn@tu-bs.de
   Phone: +49-531-391 5891;  Fax: +49-531-391 5854





Edited by H. Michael; received January 16, 2008; revised March 08, 2008; accepted March 10, 2008; published March 25, 2008



Abstract

Systems biology approaches to bacteria require an integrated database and a bioinformatics tool platform to enable automated and manual annotation, regulatory and metabolic network deduction, and the storage of related experimental as well as predicted data. In this context ROSY - the ROseobacter SYstems biology database - was developed for completed and draft genomes of representatives of the marine Roseobacter clade, which constitutes one of the most abundant bacterial clades in the ocean. ROSY provides an integrated view of comprehensive data collections such as KEGG, GenBank, RoseoBase, BRENDA, and PRODORIC, and mediates the use of connected tools for promoter analysis (Virtual Footprint), genome and pathway visualization (CGView, PathCompare), and prediction of signal peptides (PrediSi). Moreover, metabolome, transcriptome, and proteome data can be stored in ROSY, supplying an integrated platform for comparative genomics and systems biology. This entire database system along with the data retrieval, comparative analysis, and website presentation tools (http://rosy.tu-bs.de) can be easily adapted for the systems biological analysis of other bacterial groups.

Keywords: Roseobacter, database integration, comparative analysis



Introduction

In times of global warming considerable attention is drawn to the marine ecosystem because of its enormous potential to significantly influence the production and conversion of greenhouse gases. Marine microorganisms are the main catalysts of the oceanic carbon, sulfur, and nitrogen cycle, and form the basis of the food chain in the ocean [1]. In addition to their ecological importance, there is a strong biotechnological interest in bacterioplankton. The total number of novel proteins, discovered by a recent metagenome survey in the ocean bacterioplankton, almost equalled that of all previously known proteins from all organisms studied in the past and added significant diversity to known proteins. Thus, many more enzymes with unique and novel properties of commercial interest are expected from analyses of the marine ecosystem [2].

However, the majority of marine microbes could not be cultured until recently. Thus, culture-independent molecular biology approaches were applied and significantly enhanced our understanding of marine microbial communities [1, 2]. For instance, the novel pigment proteorhodopsin, a light-driven proton pump in bacterial membranes, was discovered during random sequencing of marine environmental DNA [3]. Meanwhile, representatives of abundant marine phyla were successfully cultivated, and complete genome sequences are now available [4, 5]. One of the most versatile groups in the marine habitat is the Roseobacter lineage, a subgroup of the Rhodobacterales (α-proteobacteria), comprising 41 major species-level clusters of closely related strains, both cultivated and uncultivated [6]. These bacteria colonize practically all marine habitats, including sediments, plankton, invertebrates, and fish. Certain phylotypes have been shown to comprise up to 20% of all cells in Antarctic bacterioplankton or in coastal environments [7]. They harbor complex physiologies, interesting secondary metabolites, and novel enzymes [8, 9]. The metabolic diversity and ecological importance of the Roseobacter clade have made it an important focus of genome sequencing projects. The first completed genomes, those of Silicibacter pomeroyi and Roseobacter denitrificans [4, 10], were followed by a massive sequencing effort of the Gordon and Betty Moore Foundation, resulting in currently 32 draft and 6 finished genome sequences from the Roseobacter clade. This situation creates an urgent need for bioinformatics tools. Firstly, they must allow manual curation of automatically annotated genomes. Secondly, comparison between genomes of relatively closely related but ecologically distinct strains with respect to their specific regulatory and metabolic networks, unique enzymes, transporters, and regulators is required to obtain first insights into strain-specific adaptation strategies. Thirdly, these sequence-derived data need to be integrated with experimental high-throughput transcriptome, proteome, and metabolome data, and with other biochemical, genetic, and physiological data from the literature, in order to proceed towards an understanding of the overall molecular adaptation strategies of these organisms to their habitats.

Automatic annotation procedures greatly enhance the information that can be extracted from the genomes. However, manual curation of the data is still required. Considering the high complexity of the marine ecosystem and how little is known about the regulatory and metabolic networks involved, protein identifications and deduced biochemical pathways suggested by automatic annotation procedures have to be thoroughly revised manually. ROSY, the ROseobacter SYstems biology database, supports manual curation of genome data by providing a web-accessible platform that integrates information from multiple data sources into one system enriched with the necessary bioinformatics tools. This integrative system can be universally expanded to other bacterial groups.



System blueprint

In order to sustain systems biology approaches for the various sequenced Roseobacter clade members, annotated genomes of high quality are required first. Metabolic and gene regulatory models rely on correctly annotated genes. One challenging part of this procedure is the characterization of unfinished and raw-annotated Roseobacter clade genomes. In this context, the ROSY platform is beneficial for the comparison of unfinished and raw-annotated genomes to complete, annotated, and manually revised DNA sequences.

Upon completion of the genome annotation, software tools for regulatory and metabolic network prediction and subsequent data integration are needed. Consequently, ROSY has to provide bioinformatics tools for network prediction. At this point the ground has to be laid for the inclusion of experimental high-throughput transcriptome, proteome, and metabolome data. These data have to be stored appropriately and interpreted using bioinformatics tools. Here, ROSY serves as a storage and automated interpretation platform. Subsequently, comparison of model and experimental data will provide the basis for first steps towards integrated systems biology models.

An appropriate platform for all these purposes was established previously for the genus Pseudomonas and called SYSTOMONAS. It is an integrated information and interpretation system that contains a collection of publicly available information from diverse data sources and includes useful predictions. It also provides helpful tools for the visualization and analysis of transcriptome, proteome, and especially metabolome data [11]. Therefore, the SYSTOMONAS structure served as a blueprint for the Roseobacter systems biology database and bioinformatics tool platform ROSY. However, several new features were developed to additionally provide ROSY with powerful comparative genome analysis tools in response to the specific requirements of the Roseobacter clade analysis. These are described in the section 'Integration of new features'.

The system consists of two parts: (a) the data warehouse that contains all data and scripts to extract, transform, load, and compare the data from multiple sources and (b) the web portal, on which several visualization and analysis tools are readily accessible (Fig. 1) [12]. The first step in creating the database was to transfer all data from BioCyc [13], PRODORIC [14], KEGG [15], IMG [16], RoseoBase (www.roseobase.org), Pseudomonas Genome Database version 2 (PGDv2) [17], NCBI [18], Genome Reviews [19], Venter Institute, and ENZYME [20], which were available as flatfiles, into a database called metabold. From this container, the Roseobacter information was extracted to the specialized database ROSY. Data of further marine Rhodobacterales bacteria were included in order to gain improved protein annotation and first insights into gene regulatory networks. Once this extraction was finished, selected homology analyses and further enzyme annotation of the Roseobacter related bacteria were performed and incorporated into ROSY. Homologous proteins were identified by protein sequence comparison using conservative BLAST searches with an expectation value of less than 10^-50 and a minimum sequence identity of 60% [11]. Further missing enzymes were predicted by gene sequence comparison using the tool metaSHARK [21]. The amendments provided by the enzyme annotation process are listed in Tab. 1. Proteins that are found to be homologous by this automatic annotation process can be compared by multiple sequence alignment on the ROSY website as already detailed for SYSTOMONAS [11].
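
The homology criterion described above can be illustrated with a short filter over tabular BLAST output. This is a minimal sketch and not the ROSY pipeline itself: it assumes BLAST+ tabular output (-outfmt 6), whose default columns place the percent identity in column 3 and the E-value in column 11. The function name filter_homologs and all sample hits are invented for illustration.

```python
import csv
import io

def filter_homologs(blast_tsv, max_evalue=1e-50, min_identity=60.0):
    """Return (query, subject) pairs that pass both thresholds.

    Assumes the default BLAST+ tabular columns: qseqid, sseqid, pident,
    length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    hits = []
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment headers
        query, subject = row[0], row[1]
        identity = float(row[2])
        evalue = float(row[10])
        # E-value strictly below the cutoff, identity at or above the minimum
        if query != subject and evalue < max_evalue and identity >= min_identity:
            hits.append((query, subject))
    return hits

# Hypothetical hits; identifiers and scores are invented for illustration.
SAMPLE = "\n".join([
    "geneA\tgeneX\t82.5\t300\t52\t0\t1\t300\t1\t300\t1e-120\t420",
    "geneA\tgeneY\t45.0\t280\t150\t4\t1\t280\t5\t284\t1e-60\t210",
    "geneB\tgeneZ\t91.0\t120\t11\t0\t1\t120\t1\t120\t1e-30\t150",
])

homologs = filter_homologs(SAMPLE)
print(homologs)
```

In this invented sample only the first hit passes: the second fails the identity threshold and the third fails the E-value threshold. Both cutoffs are exposed as parameters so that less conservative settings can be explored.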



Figure 1: Data warehouse and web portal for ROSY. Diverse data sources were integrated in the intermediate data container metabold. Subsequently, the core data of metabold and the results from comparative genomics were transferred into ROSY. These integrated data and the dynamically retrieved information of external web services can be accessed at the website http://rosy.tu-bs.de.


Table 1: Statistics of the enzyme annotation for the metabolic network reconstruction in ROSY for Roseobacter-like marine bacteria.
                                               Enzyme annotation
Species                              Proteins  Total  KEGG  Genome Reviews  GenBank  ENZYME  RoseoBase  Venter  BLAST  metaSHARK
Dinoroseobacter shibae DFL-12            4277    697     0               0        0       0          0     488     73        181
Erythrobacter litoralis                  3056    585   479               0        0       0          0       0     42        105
Hoeflea phototrophica DFL-43             4407    700     0               0        0       0          0       0    456        403
Jannaschia sp. CCS1                      4336   1036   747             492      433      66          0       0    284        103
Loktanella vestfoldensis SKA53           3109    867     0               0        0       0        239       0    614        265
Oceanibulbus indolifex HEL-45            4196    954     0               0        0       0          0     263    665        271
Oceanicola batsensis                     4255    897     0               0        0       0        267       0    568        271
Oceanicola granulosus                    3807    899     0               0        0       0        256       0    572        297
Phaeobacter gallaeciensis BS107          4118    971     0               0        0       0          0       0    900        346
Rhodobacterales bacterium                4757    925     0               0        0       0        273       0    562        311
Rhodobacter sphaeroides 2.4.1            4370   1146   952               0      579       0          0       0    206        107
Roseobacter denitrificans OCh 114        4191   1295  1121             966      963      54          0       0    148         94
Roseobacter litoralis                    4785   1262     0               0        0       0          0     265   1016        309
Roseobacter sp. CCS2                     3696    865     0               0        0       0         71       0    729        321
Roseobacter sp. MED193                   4598   1061     0               0        0       0        285       0    750        295
Roseovarius nubinhibens ISM              3597    888     0               0        0       0        238       0    637        247
Roseovarius sp. 217                      4817   1009     0               0        0       0        272       0    709        283
Sagittula stellata E-37                  5112   1019     0               0        0       0        299       0    656        301
Silicibacter pomeroyi DSS-3              4348   1222   934             680      635     148          0       0    251        126
Silicibacter TM1040                      3864    932     0               0        0       0          0       0    851        346
Stappia alexandrii DFL-11                5335    848     0               0        0       0          0     400    296        281
Sulfitobacter sp. EE-36                  3529    888     0               0        0       0          0       0    810        321
Most enzyme information was provided by KEGG. Additionally, Genome Reviews (EBI), GenBank (NCBI), ENZYME, BioCyc, RoseoBase, and the Venter Institute contributed to the functional annotation of enzymes. Missing enzymes were predicted via comparative genomics (BLAST, metaSHARK). Please note that some enzyme annotations from the databases overlap, so the enzyme counts of the individual resources do not necessarily sum to the total number of enzymes. A value of 0 indicates that no data were available from the database as of 31 August 2007.
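
The note on overlapping annotations can be made concrete with sets: the total counts distinct annotated entries, so it is at most the sum of the per-source counts. The following is a minimal sketch under the assumption that each source contributes a set of annotated entries. The source names follow Table 1, but the entry identifiers are invented and are not ROSY data.

```python
# Hypothetical annotated entries per source for a single genome.
annotations = {
    "KEGG": {"entry_001", "entry_002", "entry_003"},
    "BLAST": {"entry_002", "entry_003", "entry_004"},
    "metaSHARK": {"entry_004", "entry_005"},
}

# Number of entries each source annotates on its own.
per_source = {source: len(entries) for source, entries in annotations.items()}

# The total counts every distinct entry once, however many sources report it.
total = len(set().union(*annotations.values()))

print(per_source)
print("total:", total, "sum of sources:", sum(per_source.values()))
```

Here the three sources report eight annotations in sum, but only five distinct entries, which is why the Total column can be smaller than the sum of the other columns.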


The web portal of ROSY combines data from the local database with data that are dynamically retrieved from remote databases by web services [11]. A web service is a software system that supports communication between computers over a network. This exchange of information between servers is commonly conducted with the communication standard SOAP (originally an acronym for Simple Object Access Protocol). Web services can be implemented for both databases and tools. Web services have meanwhile been established for KEGG [15], EBI [22], NCBI (www.ncbi.nlm.nih.gov/entrez/query/static/esoap_help.html), PRODORIC [23], and BRENDA [24], amongst many other databases, and their number is still growing. The fundamental advantage of web services is that they allow access to up-to-date information from the external service provider and thereby make local curation of these data unnecessary. Therefore, web services were combined with the data warehouse for the ROSY platform (Tab. 2).
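
To illustrate what a SOAP exchange looks like at the message level, the following sketch builds a SOAP 1.1 request envelope with the Python standard library. It is an illustration only and does not reproduce the interface of any of the services named above: the service namespace urn:example:kinetics, the operation getKineticData, and the parameter ecNumber are hypothetical, and no request is sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:kinetics"  # hypothetical service namespace

def build_request(operation, params):
    """Build a SOAP 1.1 request envelope and return it as an XML string."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The operation element lives in the namespace of the (hypothetical) service.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")

# Hypothetical operation and parameter names, for illustration only.
request_xml = build_request("getKineticData", {"ecNumber": "1.1.1.1"})
print(request_xml)
```

A client would send such an envelope to the service endpoint over HTTP and parse the response envelope in the same way. In practice the operation names and parameters are taken from the WSDL description published by each provider.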


Table 2: External web services implemented via SOAP and embedded tools on the ROSY websites.
Database / Tool       Function                                     ROSY website form
Web services
BRENDA                Kinetic and disease data                     EC
Sabio-RK (new)        Kinetic data                                 EC
PRODORIC              Operon, TFBS, transcriptomics data           Gene, interaction, transcriptomics
PrediSi (new)         Prediction of signal peptides                Protein
KEGG                  Metabolic pathway maps                       Pathway
Tools
Virtual Footprint     Promoter regulon + analysis                  PRODORIC tool
CGView (new)          Genome view                                  Gene
Jalview               Multiple sequence alignment                  Protein
BLAST (new)           Searching homologous proteins                Protein search
PathCompare           Visualization of pathways                    EC
PathCheck (new)       Comparison of pathways between two species   Pathway search
Metabolome analysis   Plotting metabolome data                     Omics search
Since the first release of SYSTOMONAS, new web services and features have been included in order to enhance comparative analysis of partially annotated genomes. TFBS - transcription factor binding sites; EC - Enzyme Commission number, a numerical classification scheme for enzymes; interaction - includes metabolic reactions and relations between transcription factors and regulated genes.


Furthermore, the ROSY database structure is prepared to store and analyze future experimental data from the fields of metabolomics, transcriptomics, and proteomics. Finally, several tools were adjusted to the ROSY platform (Tab. 2) such as the metabolic pathway visualization tool PathCompare and Virtual Footprint [23], a tool for finding transcription factor binding sites in promoter regions of the genes. Consequently, PathCompare now indicates enzymes of Roseobacter in the metabolic pathways and the integrated Virtual Footprint software predicts genome based regulons of Roseobacter.



Integration of new features

Several new features and amendments to the previous system were introduced in order to improve ROSY in terms of comparative genomics (Tab. 2). On the level of genome sequences, CGView was implemented to visualize the genomic environment of a given gene. This is helpful because bacterial genes are often organized in clusters or even operons, so the gene neighborhood frequently provides information on the potential function and regulation of a gene. CGView [25] was integrated in two different ways. Each gene entry contains a static clickable image, which depicts possible operon structures. Furthermore, CGView was also introduced as an applet, in which zooming into the gene detail and rotating the complete genome is possible interactively (Fig. 2).
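
The idea of inspecting the neighborhood of a gene on a circular chromosome can be sketched as follows. This is not CGView code. The Gene record, the neighborhood function, the window size, and all coordinates are hypothetical and serve only to show how distances wrap around the origin of a circular genome.

```python
from typing import NamedTuple

class Gene(NamedTuple):
    name: str
    start: int
    end: int
    strand: str

def neighborhood(genes, target, window, genome_length):
    """Return genes whose midpoint lies within `window` bp of the target's
    midpoint, measuring distance around a circular genome."""
    centre = next(g for g in genes if g.name == target)
    mid = (centre.start + centre.end) // 2
    result = []
    for g in genes:
        d = abs((g.start + g.end) // 2 - mid)
        d = min(d, genome_length - d)  # take the shorter way around the circle
        if d <= window:
            result.append(g)
    return sorted(result, key=lambda g: g.start)

# Invented coordinates on a 10 kb circular genome, for illustration only.
GENES = [
    Gene("geneA", 100, 900, "+"),
    Gene("geneB", 950, 1800, "+"),
    Gene("geneC", 5000, 6000, "-"),
    Gene("geneD", 9500, 9900, "+"),
]

nearby = [g.name for g in neighborhood(GENES, "geneA", 1500, 10_000)]
print(nearby)
```

In this invented example geneD counts as a neighbor of geneA although its coordinates are numerically far away, because the two genes are adjacent across the origin. Filtering the result by strand would be a simple first step towards proposing candidate operons.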



Figure 2: Gene neighborhood of hemA visualized by CGView. This screenshot is taken from the hemA gene entry of Roseobacter denitrificans. Several genes coding for proteins that support the production of photosynthetically active pigments are located close to hemA.


Further improvements were made to the characterization of proteins. PrediSi, a tool for predicting signal peptides of proteins and thus hinting at membrane and extracellular proteins [26], was implemented on the protein entries via web service (Fig. 3, Tab. 2). In order to gain insight into the function of an uncharacterized protein, BLAST [27] was introduced to the protein search. In this way, homologous proteins of related Roseobacter species can be found for which functional annotations are available. Each BLAST search is performed on all protein sequences in ROSY.



Figure 3: Protein entry for the efflux transporter protein AcrE of Roseobacter denitrificans. The membrane protein prediction tool PrediSi indicates the cleavage site of AcrE and a probability score.


In order to detect the differences between organisms, the interactive tool PathCheck was developed for comparing the metabolic capabilities of two different organisms pathway by pathway (Fig. 4). For a given metabolic pathway, PathCheck counts all EC numbers present in the pathway and the EC numbers found in each organism. In this way, the coverage of each pathway in a single species can be easily perceived, and the differences between two organisms become apparent.
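
The counting scheme described for PathCheck can be sketched with set operations. This is a minimal illustration under stated assumptions and not the PathCheck implementation: the function names are hypothetical, the pathway is reduced to four EC numbers, and the two organism EC sets are invented rather than taken from ROSY.

```python
def pathway_coverage(pathway_ecs, organism_ecs):
    """Return (EC numbers of the pathway found in the organism, pathway size)."""
    return len(pathway_ecs & organism_ecs), len(pathway_ecs)

def compare(pathways, ecs_a, ecs_b):
    """Per pathway: coverage in both organisms and the pathway ECs unique to each."""
    report = {}
    for name, ecs in pathways.items():
        report[name] = {
            "a": pathway_coverage(ecs, ecs_a),
            "b": pathway_coverage(ecs, ecs_b),
            "only_a": sorted(ecs & (ecs_a - ecs_b)),
            "only_b": sorted(ecs & (ecs_b - ecs_a)),
        }
    return report

# A reduced example pathway and two invented organism annotations.
PATHWAYS = {"glycolysis": {"2.7.1.1", "5.3.1.9", "2.7.1.11", "4.1.2.13"}}
ORG_A = {"2.7.1.1", "5.3.1.9", "4.1.2.13", "1.1.1.1"}
ORG_B = {"2.7.1.1", "2.7.1.11", "4.1.2.13"}

report = compare(PATHWAYS, ORG_A, ORG_B)
print(report)
```

Both invented organisms cover three of the four EC numbers, yet the entries only_a and only_b show that they differ in which step is present. This is the kind of difference a pairwise comparison is meant to expose.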



Figure 4: Comparison of the metabolic enzyme coverage between two different marine prokaryotes. The integrated tool PathCheck highlights differences between Roseobacter denitrificans (RDE) and Rhodobacter sphaeroides (RSP) in regard to their metabolic properties.


Kinetic parameters of proteins are retrieved dynamically from the BRENDA [24] and Sabio-RK [28] web services (Tab. 2) and are displayed within the EC entry on the ROSY website.



Application and conclusion

The complex interactions and dependencies within marine microbial communities are a huge reservoir of biological information yet to be discovered. In the past few years, genomics approaches have started to yield remarkable insights into the corresponding complex and dynamic bacterial physiologies. Considering the fast-growing knowledge about these marine ecosystems, ROSY is a helpful tool for the functional characterization of selected Rhodobacterales genomes and a 'ready-to-use' container for future 'omics' data.

ROSY was developed in order to facilitate the revision of automatically annotated data. Both draft and complete Roseobacter genomes were included, such as that of Dinoroseobacter shibae strain DFL-12 [29], which has been sequenced recently. This organism has undergone automatic annotation and is currently in the process of careful revision.

Several other websites include Roseobacter-specific data, such as RoseoBase (http://www.roseobase.org), or hundreds of genomes of distantly related species, such as IMG [16] and MicrobesOnline [30]. In contrast to these projects, which essentially summarize genomes, ROSY can be populated with individual complete and even draft genomes of a desired coherent phylogenetic group and integrates them fully. Especially during the manual curation process of a sequenced genome, ROSY provides useful tools and supplementary information that support and facilitate the assignment of gene functions.

Most importantly, in a next step the genome derived data can be combined with experimental 'omics' data, annotated literature, and predicted data. Many necessary tools for this purpose are provided by ROSY. In this way, the generation of systems biology models and their experimental validation are sustained.



Acknowledgements

We thank Ida Retter for thorough reading of the manuscript. This work was funded by the German Bundesministerium für Bildung und Forschung (BMBF) for the National Genome Research Network (NGFN2-EP, Grant No. 0313398A), BMBF for the Bioinformatics Competence Center Intergenomics (Grant No. 031U110A / 031U210A), and the Volkswagen Foundation.




References


  1. DeLong, E. F. and Karl, D. M. (2005). Genomic perspectives in microbial oceanography. Nature 437, 336-342.

  2. Yooseph, S., Sutton, G., Rusch, D. B., Halpern, A. L., Williamson, S. J., Remington, K., Eisen, J. A., Heidelberg, K. B., Manning, G., Li, W., Jaroszewski, L., Cieplak, P., Miller, C. S., Li, H., Mashiyama, S. T., Joachimiak, M. P., vanBelle, C., Chandonia, J.-M., Soergel, D. A., Zhai, Y., Natarajan, K., Lee, S., Raphael, B. J., Bafna, V., Friedman, R., Brenner, S. E., Godzik, A., Eisenberg, D., Dixon, J. E., Taylor, S. S., Strausberg, R. L., Frazier, M. and Venter, J. C. (2007). The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families. PLoS Biol. 5, e16.

  3. Béjà, O., Aravind, L., Koonin, E. V., Suzuki, M. T., Hadd, A., Nguyen, L. P., Jovanovich, S. B., Gates, C. M., Feldman, R. A., Spudich, J. L., Spudich, E. N. and DeLong, E. F. (2000). Bacterial rhodopsin: evidence for a new type of phototrophy in the sea. Science 289, 1902-1906.

  4. Swingley, W. D., Sadekar, S., Mastrian, S. D., Matthies, H. J., Hao, J., Ramos, H., Acharya, C. R., Conrad, A. L., Taylor, H. L., Dejesa, L. C., Shah, M. K., O'Huallachain, M. E., Lince, M. T., Blankenship, R. E., Beatty, J. T. and Touchman, J. W. (2007). The complete genome sequence of Roseobacter denitrificans reveals a mixotrophic rather than photosynthetic metabolism. J. Bacteriol. 189, 683-690.

  5. Moran, M. A., Belas, R., Schell, M. A., González, J. M., Sun, F., Sun, S., Binder, B. J., Edmonds, J., Ye, W., Orcutt, B., Howard, E. C., Meile, C., Palefsky, W., Goesmann, A., Ren, Q., Paulsen, I., Ulrich, L. E., Thompson, L. S., Saunders, E. and Buchan, A. (2007). Ecological genomics of marine Roseobacters. Appl. Environ. Microbiol. 73, 4559-4569.

  6. Wagner-Döbler, I. and Biebl, H. (2006). Environmental biology of the marine Roseobacter lineage. Annu. Rev. Microbiol. 60, 255-280.

  7. Selje, N., Simon, M. and Brinkhoff, T. (2004). A newly discovered Roseobacter cluster in temperate and polar oceans. Nature 427, 445-448.

  8. Moran, M. A. and Miller, W. L. (2007). Resourceful heterotrophs make the most of light in the coastal ocean. Nat. Rev. Microbiol. 5, 792-800.

  9. Wagner-Döbler, I., Thiel, V., Eberl, L., Allgaier, M., Bodor, A., Meyer, S., Ebner, S., Hennig, A., Pukall, R. and Schulz, S. (2005). Discovery of complex mixtures of novel long-chain quorum sensing signals in free-living and host-associated marine alphaproteobacteria. Chembiochem. 6, 2195-2206.

  10. Moran, M. A., Buchan, A., González, J. M., Heidelberg, J. F., Whitman, W. B., Kiene, R. P., Henriksen, J. R., King, G. M., Belas, R., Fuqua, C., Brinkac, L., Lewis, M., Johri, S., Weaver, B., Pai, G., Eisen, J. A., Rahe, E., Sheldon, W. M., Ye, W., Miller, T. R., Carlton, J., Rasko, D. A., Paulsen, I. T., Ren, Q., Daugherty, S. C., Deboy, R. T., Dodson, R. J., Durkin, A. S., Madupu, R., Nelson, W. C., Sullivan, S. A., Rosovitz, M. J., Haft, D. H., Selengut, J. and Ward, N. (2004). Genome sequence of Silicibacter pomeroyi reveals adaptations to the marine environment. Nature 432, 910-913.

  11. Choi, C., Münch, R., Bunk, B., Barthelmes, J., Ebeling, C., Schomburg, D., Schobert, M. and Jahn, D. (2007). Combination of a data warehouse concept with web services for the establishment of the Pseudomonas systems biology database SYSTOMONAS. J. Integrative Bioinformatics 4, 48.

  12. Choi, C., Münch, R., Leupold, S., Klein, J., Siegel, I., Thielen, B., Benkert, B., Kucklick, M., Schobert, M., Barthelmes, J., Ebeling, C., Haddad, I., Scheer, M., Grote, A., Hiller, K., Bunk, B., Schreiber, K., Retter, I., Schomburg, D. and Jahn, D. (2007). SYSTOMONAS - an integrated database for systems biology analysis of Pseudomonas. Nucleic Acids Res. 35, D533-D537.

  13. Karp, P. D., Ouzounis, C. A., Moore-Kochlacs, C., Goldovsky, L., Kaipa, P., Ahrén, D., Tsoka, S., Darzentas, N., Kunin, V. and López-Bigas, N. (2005). Expansion of the BioCyc collection of pathway/genome databases to 160 genomes. Nucleic Acids Res. 33, 6083-6089.

  14. Münch, R., Hiller, K., Barg, H., Heldt, D., Linz, S., Wingender, E. and Jahn, D. (2003). PRODORIC: prokaryotic database of gene regulation. Nucleic Acids Res. 31, 266-269.

  15. Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y. and Hattori, M. (2004). The KEGG resource for deciphering the genome. Nucleic Acids Res. 32, D277-D280.

  16. Markowitz, V. M., Korzeniewski, F., Palaniappan, K., Szeto, E., Werner, G., Padki, A., Zhao, X., Dubchak, I., Hugenholtz, P., Anderson, I., Lykidis, A., Mavromatis, K., Ivanova, N. and Kyrpides, N. C. (2006). The integrated microbial genomes (IMG) system. Nucleic Acids Res. 34, D344-D348.

  17. Winsor, G. L., Lo, R., Sui, S. J. H., Ung, K. S. E., Huang, S., Cheng, D., Ching, W.-K. H., Hancock, R. E. W. and Brinkman, F. S. L. (2005). Pseudomonas aeruginosa Genome Database and PseudoCAP: facilitating community-based, continually updated, genome annotation. Nucleic Acids Res. 33, D338-D343.

  18. Pruitt, K. D., Tatusova, T. and Maglott, D. R. (2007). NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 35, D61-D65.

  19. Kersey, P., Bower, L., Morris, L., Horne, A., Petryszak, R., Kanz, C., Kanapin, A., Das, U., Michoud, K., Phan, I., Gattiker, A., Kulikova, T., Faruque, N., Duggan, K., Mclaren, P., Reimholz, B., Duret, L., Penel, S., Reuter, I. and Apweiler, R. (2005). Integr8 and Genome Reviews: integrated views of complete genomes and proteomes. Nucleic Acids Res. 33, D297-D302.

  20. Bairoch, A. (2000). The ENZYME database in 2000. Nucleic Acids Res. 28, 304-305.

  21. Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792-1797.

  22. Labarga, A., Valentin, F., Anderson, M. and Lopez, R. (2007). Web services at the European bioinformatics institute. Nucleic Acids Res. 35, W6-W11.

  23. Münch, R., Hiller, K., Grote, A., Scheer, M., Klein, J., Schobert, M. and Jahn, D. (2005). Virtual Footprint and PRODORIC: an integrative framework for regulon prediction in prokaryotes. Bioinformatics 21, 4187-4189.

  24. Barthelmes, J., Ebeling, C., Chang, A., Schomburg, I. and Schomburg, D. (2007). BRENDA, AMENDA and FRENDA: the enzyme information system in 2007. Nucleic Acids Res. 35, D511-D514.

  25. Stothard, P. and Wishart, D. S. (2005). Circular genome visualization and exploration using CGView. Bioinformatics 21, 537-539.

  26. Hiller, K., Grote, A., Scheer, M., Münch, R. and Jahn, D. (2004). PrediSi: prediction of signal peptides and their cleavage positions. Nucleic Acids Res. 32, W375-W379.

  27. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. and Lipman, D. J. (1990). Basic local alignment search tool. J. Mol. Biol. 215, 403-410.

  28. Rojas, I., Golebiewski, M., Kania, R., Krebs, O., Mir, S., Weidemann, A. and Wittig, U. (2007). Storing and annotating of kinetic data. In Silico Biol. 7 S1, 05.

  29. Biebl, H., Allgaier, M., Tindall, B. J., Koblizek, M., Lünsdorf, H., Pukall, R. and Wagner-Döbler, I. (2005). Dinoroseobacter shibae gen. nov., sp. nov., a new aerobic phototrophic bacterium isolated from dinoflagellates. Int. J. Syst. Evol. Microbiol. 55, 1089-1096.

  30. Alm, E. J., Huang, K. H., Price, M. N., Koche, R. P., Keller, K., Dubchak, I. L. and Arkin, A. P. (2005). The MicrobesOnline Web site for comparative genomics. Genome Res. 15, 1015-1022.