In Silico Biology 2, 0036 (2002); ©2002, Bioinformation Systems e.V.  
GCB '01


Supporting genotype-phenotype correlation with the rare metabolic diseases database Ramedis

Thoralf Töpel1, Uwe Scholz2, Ulrike Mischke3, Dagmar Scheible3, Ralf Hofestädt4, Friedrich Trefz3




1Otto-von-Guericke University Magdeburg, Institute for Technical and Business Information Systems, Germany
E-mail: toepel@iti.cs.uni-magdeburg.de
2Institute for Plant Genetics and Crop Plant Research Gatersleben, Plant Genome Resources Centre, Germany
3Children's Hospital Reutlingen, Germany
4Bielefeld University, Faculty of Technology, Germany





Edited by E. Wingender; received December 14, 2001; accepted January 8, 2001; published April 02, 2002


Abstract

To gain further knowledge about rare genetic diseases, a worldwide method for data collection via the Internet has been established. This new approach will improve the collection of valuable data from single case reports. Ramedis stores standardised patient data that will be usable for statistics, longitudinal examinations and cooperative studies in the future. Embedded in the context of the German Human Genome Project, Ramedis will directly enable phenotype-genotype correlations. Besides a better characterisation of the clinical heterogeneity of rare metabolic diseases, there may be a great benefit for the treatment of these patients, in whom prospective studies are otherwise expensive and difficult to perform. This contribution presents the motivation for the system and introduces its features, its current state and the future of the project. Additionally, the first experiences of health professionals using Ramedis are described.

Key words: case study, database, genotype-phenotype correlation, information system, rare metabolic disease, remote data entry


Introduction

Many publications, for example in conference proceedings, present case reports, especially of rare metabolic diseases. They contain valuable information for a better understanding of rare conditions. However, much of this information may be "lost" in common publication forms.

Ramedis is an alternative approach to publishing case reports via the Internet. In contrast to electronic journals, we use a formal procedure for entering the data into Ramedis. As only standardised data are allowed (there are only a few free-text fields, e.g. "Abstract"), the data will be usable for studies and statistical evaluation.

In future, a publishing committee will review each submitted case report. The aim is to collect high quality data on a central server, whereby the rights stay with the corresponding author. Furthermore, since molecular data are also included as far as available, Ramedis will be useful for genotype-phenotype investigations.



From genotype to phenotype

Inborn errors of metabolism are characterised by a block in a metabolic pathway, a deficiency of a transport protein or a defect in a storage mechanism caused by a gene defect. The defective gene leads to absent or faulty production of essential proteins, especially enzymes. These enzymes are important components of the biochemical processes in cells and tissues: they enable, disable or catalyse the biochemical reactions of metabolic pathways. Thus, these disorders of metabolism result in a threatening deficiency or accumulation of intermediate metabolites in the human organism and in the corresponding symptoms.

If a patient is suspected of having an inborn error of metabolism, specialised biochemical laboratories analyse enzyme activities in specimens of different tissues (skin, liver etc.) and investigate body fluids such as blood and urine for unusual metabolic patterns. With molecular methods it is also possible to confirm a diagnosis and to identify the defective gene. In a screening procedure, a number of inborn metabolic errors, e.g. phenylketonuria (PKU), can already be detected prenatally or immediately after birth.

For inborn errors of metabolism, a large amount of data is available in different databases accessible via the Internet. A huge number of genes, enzymes and metabolic pathways have already been identified, isolated, sequenced and collected in these databases. For example, EMBL [1] and GenBank [2] contain DNA sequences, and TRANSFAC [3] holds knowledge about gene expression. Metabolic pathways and their individual biochemical reactions are stored in the KEGG [4] system, whereas BRENDA [5] describes the behaviour of enzyme-driven processes. For medical data, the databases MD-Cave [6] or Metagene [7] can be used. Most inborn errors of metabolism are also included in OMIM [8].

The amount of this electronically available knowledge of genes, enzymes, metabolic pathways and metabolic diseases is increasing rapidly. However, each of these databases gives only a highly specialised view of the biological system. This leads to the general task of integrating all this knowledge and making it biotechnologically and medically applicable.

Within the scope of the German Human Genome Project a consortium of five partners has been founded to develop a bioinformatics system for representing, modeling and simulating genetic effects on gene regulation and metabolic processes in human cells. Thus, the correlation between the genotype and the clinically apparent phenotype will be established.



User interface

For the physician, the user interface plays an important role in the acceptance of a system. This factor is often underestimated. Therefore, an easy-to-use system oriented towards common standards was developed. The use of Ramedis is free, but registration (a user account) is required.

Clinical symptoms, laboratory findings, molecular genetics, and data concerning the patient's diet regimes and medication are collected. Additionally, clinical findings can be stored as pictures (X-rays, MRI scans, histopathological data etc.). The user can either analyse all data already present in the database in an anonymous manner, or submit new cases and edit his or her own case reports. Note that most fields are filled from selection lists and not by typing. Therefore, misspellings and the use of multiple synonyms are avoided. Figures 1 and 2 show screenshots of the Data Input and Editing Tool of Ramedis: the main data with the abstract in the first picture, and the table of laboratory findings with the selection window for laboratory substances in the second.

 
Figure 1: Screenshot of the Data Input and Editing Tool of Ramedis: main data with various input fields, e.g. diagnosis, date of diagnosis and abstract


 
Figure 2: Screenshot of the Data Input and Editing Tool of Ramedis: laboratory finding results table with selection window of the laboratory substances
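The effect of selection lists on data quality can be illustrated with a minimal sketch. It is not the Ramedis implementation (which is a Java/Swing application); the vocabulary, the class SymptomEntry and the function make_symptom_entry are hypothetical names chosen for illustration only. The idea shown is simply that an entry is accepted only if its term comes from a fixed list, so that spelling variants and synonyms never reach the database.

```python
from dataclasses import dataclass

# Hypothetical, tiny controlled vocabulary standing in for a selection list;
# the real lists in Ramedis contain several hundred terms.
ALLOWED_SYMPTOMS = frozenset({"seizures", "hepatomegaly", "vomiting"})


@dataclass(frozen=True)
class SymptomEntry:
    """One standardised symptom record for a patient (illustrative only)."""
    patient_id: int
    symptom: str


def make_symptom_entry(patient_id: int, symptom: str) -> SymptomEntry:
    """Create an entry only if the term is taken from the selection list."""
    if symptom not in ALLOWED_SYMPTOMS:
        raise ValueError(f"{symptom!r} is not in the selection list")
    return SymptomEntry(patient_id, symptom)
```

With this sketch, make_symptom_entry(7, "seizures") succeeds, while the spelling variant "seizure" is rejected with a ValueError instead of being stored as a second, non-matching term.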

The Analysis Tool makes it possible to evaluate all data stored in Ramedis. The user can send different queries to the system, e.g. to select all case reports with the same diagnosis or all metabolic patients of one centre. In addition, the user may look for similar cases by entering a combination of laboratory findings and symptoms. If a particular case in the resulting list is of interest, the user may select that case report. Screenshots of the presentation of the main data and of the laboratory findings with different graphical visualisations are shown in Figures 3 and 4. Note that the data are displayed anonymously, without birth dates, initials etc., so that the confidentiality of the data is preserved.

 
Figure 3: Screenshot of the Data Analysis Tool of Ramedis: presentation of main data with anonymised information

 
Figure 4: Screenshot of the Data Analysis Tool of Ramedis: table of laboratory finding results with different graphical visualisations
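The search for similar cases by a combination of laboratory findings and symptoms can be sketched as follows. The text does not state how Ramedis determines similarity, so the ranking by number of shared traits used here is an assumption, and the case identifiers, trait codes and the function find_similar_cases are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical anonymised case reports: case id -> set of coded traits.
# No birth dates or initials are part of the records.
CASES: Dict[int, FrozenSet[str]] = {
    101: frozenset({"symptom:seizures", "lab:phenylalanine_high"}),
    102: frozenset({"symptom:vomiting", "lab:ammonia_high"}),
    103: frozenset({"symptom:seizures", "symptom:vomiting", "lab:ammonia_high"}),
}


def find_similar_cases(
    query: FrozenSet[str],
    cases: Dict[int, FrozenSet[str]],
    min_shared: int = 1,
) -> List[Tuple[int, int]]:
    """Return (case id, shared trait count), most shared traits first.

    Assumption: similarity is the number of query traits a case contains.
    Ties are broken by ascending case id to keep the result deterministic.
    """
    scored = [(case_id, len(query & traits)) for case_id, traits in cases.items()]
    hits = [(case_id, n) for case_id, n in scored if n >= min_shared]
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))
```

For the query {"symptom:vomiting", "lab:ammonia_high"}, the sketch returns cases 102 and 103 with two shared traits each and omits case 101, which shares none.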




Technical background

All data of Ramedis are stored in one common database. For persistent storage of the information, an Oracle database management system, version 8.0.5, is used. The system runs on a dual-processor Intel personal computer with the Red Hat Linux operating system. For the connection to the Internet, an Apache web server is installed. The Internet domain of Ramedis is www.ramedis.de.

As mentioned before, Ramedis is divided into an input/editing part and an analysis component. The Data Input and Editing Tool is implemented as a Java application with Swing components. Java is a platform-independent programming language from Sun Microsystems that is widely used for Internet applications. Swing is a Java toolkit for building graphical user interfaces. JDBC is used for the connection between the tool and the Ramedis database. JDBC is a Java interface, supported by database management system vendors through drivers, that gives Java applications easy access to databases.

The Analysis Tool is based on the Oracle WebDB system. WebDB includes its own web server and connects directly to the Oracle database, but is reachable on the web via the same address (URL).
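The pattern of an application querying a relational database through a standard interface can be sketched briefly. The example below is an analogy only: it uses Python's sqlite3 module and an in-memory SQLite database in place of Java, JDBC and Oracle, and the table patient, its columns and the sample rows are hypothetical, not the Ramedis schema.

```python
import sqlite3


def open_demo_database() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical 'patient' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE patient (id INTEGER PRIMARY KEY, diagnosis TEXT NOT NULL)"
    )
    # Invented sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO patient (id, diagnosis) VALUES (?, ?)",
        [(1, "Phenylketonuria"), (2, "Phenylketonuria"), (3, "Galactosaemia")],
    )
    conn.commit()
    return conn


def count_cases(conn: sqlite3.Connection, diagnosis: str) -> int:
    """Count case reports with the given diagnosis via a parameterised query."""
    row = conn.execute(
        "SELECT COUNT(*) FROM patient WHERE diagnosis = ?", (diagnosis,)
    ).fetchone()
    return row[0]
```

The query parameter is passed separately from the SQL text, which is the same design choice a JDBC PreparedStatement supports: user input is never concatenated into the statement.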



Current state and future

Patient data have been collected online since November 2000. Up to now, December 2001, about 290 case reports of patients with rare metabolic diseases, some with hundreds of traits, have been submitted. Further outside laboratories will be connected during 2002. The final version of Ramedis will also include more detailed family history data. The following table shows an extract of the information stored in the database as of 14th December 2001.

Number of valid authors                                           25
Number of inserted patients                                      295

Allowed values of standardised input fields
  Number of available diagnoses                                  345
  Number of available symptoms                                   626
  Number of available lab findings                               699
  Number of available lab findings in different specimens       1335

Amount of data committed by authors
  Total number of committed clinical symptoms                   1570
  Average number of committed clinical symptoms per patient     5.32
  Total number of committed lab finding results                 5282
  Average number of committed lab finding results per patient  17.91
  Number of committed pictures                                    45
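The per-patient averages in the table follow directly from the totals and the number of inserted patients. A short check, using only figures from the table (the constant and function names are chosen here for illustration):

```python
# Figures taken from the table above (state of 14th December 2001).
PATIENTS = 295
TOTAL_SYMPTOMS = 1570
TOTAL_LAB_RESULTS = 5282


def average_per_patient(total: int, patients: int = PATIENTS) -> float:
    """Average number of entries per patient, rounded to two decimals."""
    return round(total / patients, 2)
```

average_per_patient(TOTAL_SYMPTOMS) gives 5.32 and average_per_patient(TOTAL_LAB_RESULTS) gives 17.91, in agreement with the table.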

The vision is to establish a system, used worldwide, that brings together the data on the very rare cases of inborn errors of metabolism. By comparing the case reports of a particular disease, a better characterisation of the clinical heterogeneity of this disease could be obtained. As an immediate aim for the Human Genome Project, the collected clinical data are used for the identification of genotype-phenotype correlations. The data may also be used for further characterisation of a genetic metabolic disease or for epidemiological investigations.

As data on therapy regimes will also be stored, Ramedis will be very useful for longitudinal and/or cooperative therapy studies. Furthermore, Ramedis will offer a basis for quality assurance between different metabolic centres concerning therapeutic outcomes.



Related work

Two examples of data collection via the Internet show the benefit that Ramedis will offer. The database for phenylketonuria (PAHdb) was created by the group of Charles Scriver in Montreal, a world-leading expert on genetic metabolic disorders. This database (www.mcgill.ca/pahdb) has become a very powerful tool for collecting molecular data, e.g. mutations, as well as some clinical parameters for genotype-phenotype correlation [9].

The aim of a project initiated by N. Blau at the University of Zurich is the collection of data from the specialised field of tetrahydrobiopterin deficiencies (www.bh4.org). Citation frequencies show that the information in these databases is highly valuable, contributing to a better understanding of genotype-phenotype correlation. Each of these two approaches covers only a single inborn error of metabolism. Ramedis offers the possibility to collect data on different rare metabolic diseases worldwide in the form of single case reports.



Evaluation

After one year of experience with health professionals using Ramedis to publish and analyse case reports, it is now possible to look back on the design decisions and their realisation and to review them. In doing so, comments concerning the further development of the system were explicitly taken into consideration.

Regarding the support of clinical studies, the possibility to collect data on different rare metabolic diseases with hundreds of variable traits complicates data entry. Most clinical studies need only a small, well-defined set of characteristics, which have to be entered rapidly. It is therefore planned to develop software that generates study-specific input masks, embedded in an appropriate security architecture and usable with a common web browser.

At present, the Analysis Tool is being re-engineered to replace the WebDB-based interface with PHP-driven web pages. This step enables more users to work with Ramedis without asking their local web administrators to change the firewall settings. Otherwise, a special Oracle SQL*Net communication port has to be opened.

To motivate further health professionals to submit their single case reports to the database, it is planned to enable electronic publication of their work, which can be referenced by an International Standard Serial Number (ISSN). In this context, the foundation of a review committee becomes even more important.



Acknowledgments

This work is supported by the German Ministry of Education and Research within the German Human Genome Project (Project "Modeling of gene regulatory networks for linking genotype-phenotype information"), grants 01KW9962/6 and 01KW9912.


References

  1. Stoesser, G., Baker, W., van den Broek, A., Camon, E., Garcia-Pastor, M., Kanz, C., Kulikova, T., Lombard, V., Lopez, R., Parkinson, H., Redaschi, N., Sterk, P., Stoehr, P. and Tuli, M. A. (2001). The EMBL nucleotide sequence database. Nucleic Acids Res. 29, 17-21.
  2. Benson, D. A., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J., Rapp, B. A. and Wheeler, D. L. (2000). GenBank. Nucleic Acids Res. 28, 15-18.
  3. Wingender, E., Chen, X., Hehl, R., Karas, H., Liebich, I., Matys, V., Meinhardt, T., Prüß, M., Reuter, I. and Schacherer, F. (2000). TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res. 28, 316-319.
  4. Kanehisa, M. and Goto, S. (2000). KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 28, 27-30.
  5. Schomburg, I., Chang, A., Hofmann, O., Ebeling, E., Ehrentreich, F. and Schomburg, D. (2002). BRENDA: a resource for enzyme data and metabolic information. Trends Biochem. Sci. 27, 54-56.
  6. Hofestädt, R., Mischke, U. and Scholz, U. (2000). Knowledge acquisition, management and representation for the diagnostic support in human inborn errors of metabolism. Stud. Health Technol. Inform. 77, 857-862.
  7. Frauendienst-Egger, G. and Trefz, F. K. (1998). METAGENE 3.0 Computersystem zur Diagnoseunterstützung angeborener Stoffwechselerkrankungen. Wissenschaftliche Verlagsgesellschaft mbH Stuttgart.
  8. Hamosh, A., Scott, A. F., Amberger, J., Valle, D. and McKusick, V. A. (2000). Online Mendelian Inheritance in Man (OMIM). Hum. Mutat. 15, 57-61.
  9. Nowacki, P. M., Byck, S., Prevost, L. and Scriver, C. R. (1998). PAH Mutation Analysis Consortium Database: 1997. Prototype for relational locus-specific mutation databases. Nucleic Acids Res. 26, 220-225.