Current systems for managing manually edited sequence annotations in genome sequencing projects still satisfy neither the needs of biologists nor the demands for storing and retrieving data safely and reliably.
We are developing a generic annotation management system (GAMS) that provides a suitable model of genomic entities and a generic application programming interface (API), as well as remote access via CORBA and document exchange via XML. The system is not specialized for or restricted to a specific organism or annotation project. It may also be used to back up automatic annotation systems like Pedant [Frishman and Mewes, 1997] or Magpie [Gaasterland and Sensen, 1996].
Object-oriented analysis (OOA) provides a good foundation for designing a suitable and adaptable model of the genomic entities that are found in a cell. We have designed classes for the main molecule types: DNA, RNA, proteins and (other) metabolites (Fig. 1).
Figure 1: Extract of the UML representation of the molecule model.
As an example, we model Dna objects as containers for DNA sequences with annotated elements such as ORFs or tRNAs. The Orf objects correspond to MessengerRna objects, and, given a gene model, these MessengerRnas correspond to certain Polypeptid objects. A Protein object contains several Polypeptids and carries certain properties as well as cross-references to literature or protein database sources.
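The relationships described above can be sketched as a handful of plain classes. This is only an illustration under stated assumptions: the class names follow the text, but every attribute and method name (`annotate_orf`, `subsequence`, `cross_references`, the 0-based end-exclusive coordinates) is hypothetical and not taken from the GAMS model itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Polypeptid:
    """Translation product linked to one MessengerRna (class name as in the text)."""
    sequence: str


@dataclass
class MessengerRna:
    """Transcript corresponding to one Orf; the polypeptid link depends on the gene model."""
    sequence: str
    polypeptid: Optional[Polypeptid] = None


@dataclass
class Orf:
    """Annotated element on a Dna object (assumed 0-based, end-exclusive coordinates)."""
    name: str
    start: int
    end: int
    mrna: Optional[MessengerRna] = None


@dataclass
class Dna:
    """Container for a DNA sequence together with its annotated elements."""
    name: str
    sequence: str
    orfs: List[Orf] = field(default_factory=list)

    def annotate_orf(self, name: str, start: int, end: int) -> Orf:
        """Attach a new Orf after checking that it lies inside the sequence."""
        if not (0 <= start < end <= len(self.sequence)):
            raise ValueError("ORF coordinates lie outside the sequence")
        orf = Orf(name, start, end)
        self.orfs.append(orf)
        return orf

    def subsequence(self, orf: Orf) -> str:
        """Return the stretch of DNA covered by an annotated Orf."""
        return self.sequence[orf.start:orf.end]


@dataclass
class Protein:
    """Aggregate of several Polypeptids with properties and cross-references."""
    name: str
    polypeptids: List[Polypeptid] = field(default_factory=list)
    properties: Dict[str, str] = field(default_factory=dict)
    cross_references: List[str] = field(default_factory=list)


# Hypothetical usage: annotate one ORF on a short made-up sequence.
dna = Dna("contig1", "ATGGCTTAA")
orf = dna.annotate_orf("orf1", 0, 9)
orf.mrna = MessengerRna("AUGGCUUAA", polypeptid=Polypeptid("MA"))
protein = Protein("protein1", polypeptids=[orf.mrna.polypeptid])
```

Keeping the Orf to MessengerRna to Polypeptid links as optional references reflects the remark that the correspondence depends on the gene model in use.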
Another application of this data model is the dynamic modelling of metabolic pathways [Kastenmüller and Mewes, 1999].
The API provides all the basic operations for working on genomic objects (Fig. 2). In doing so, it hides the technical aspects of database programming, so a client programmer can focus on designing clients rather than on learning how the database is organized.
Figure 2: Architecture of the API.
A CORBA layer provides remote access to the methods of the objects that are stored in the database. Alternatively, objects can be dumped into an XML document and restored from one, which enables data exchange between different applications.
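The XML round trip might look roughly like the following sketch. It uses Python's standard `xml.etree.ElementTree` module; the element and attribute names (`dna`, `sequence`, `orf`, `start`, `end`) and both function names are assumptions made for illustration, not the document format GAMS actually uses.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# An annotated ORF as (name, start, end); a simplification for this sketch.
OrfTuple = Tuple[str, int, int]


def dna_to_xml(name: str, sequence: str, orfs: List[OrfTuple]) -> str:
    """Dump a DNA record and its annotated ORFs into an XML string."""
    root = ET.Element("dna", {"name": name})
    ET.SubElement(root, "sequence").text = sequence
    for orf_name, start, end in orfs:
        ET.SubElement(
            root, "orf", {"name": orf_name, "start": str(start), "end": str(end)}
        )
    return ET.tostring(root, encoding="unicode")


def dna_from_xml(document: str) -> Tuple[str, str, List[OrfTuple]]:
    """Rebuild the DNA record from an XML string produced by dna_to_xml."""
    root = ET.fromstring(document)
    sequence = root.findtext("sequence", default="")
    orfs = [
        (elem.get("name"), int(elem.get("start")), int(elem.get("end")))
        for elem in root.findall("orf")
    ]
    return root.get("name"), sequence, orfs


# Hypothetical record: dumping and reloading should give back the same data.
document = dna_to_xml("contig1", "ATGGCTTAA", [("orf1", 0, 9)])
restored = dna_from_xml(document)
```

Because the document carries only data and no behaviour, any application that understands the agreed element names can consume it without going through the CORBA layer.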
To remain independent of particular software systems such as database management systems, we direct the API calls to driver software. Each driver is responsible for managing the specific operations on its corresponding system.
The concept of abstract drivers allows us to change the implementation of existing drivers or to add new drivers to the API without changing the API. All operations are declared by the abstract driver while specific drivers implement these methods for certain systems.
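A minimal sketch of the abstract driver idea follows, assuming a very small operation set. The names `AbstractDriver`, `InMemoryDriver`, `SqliteDriver`, `AnnotationApi`, `store` and `load` are hypothetical, and SQLite merely stands in for whichever database management system a real driver would target.

```python
import json
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, Optional, Tuple


class AbstractDriver(ABC):
    """Declares the operations that every concrete driver must implement."""

    @abstractmethod
    def store(self, key: str, record: Dict[str, int]) -> None: ...

    @abstractmethod
    def load(self, key: str) -> Optional[Dict[str, int]]: ...


class InMemoryDriver(AbstractDriver):
    """Keeps records in a dictionary; handy for tests."""

    def __init__(self) -> None:
        self._records: Dict[str, Dict[str, int]] = {}

    def store(self, key: str, record: Dict[str, int]) -> None:
        self._records[key] = dict(record)

    def load(self, key: str) -> Optional[Dict[str, int]]:
        record = self._records.get(key)
        return dict(record) if record is not None else None


class SqliteDriver(AbstractDriver):
    """Stores each record as a JSON string in a SQLite table."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS records (key TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )

    def store(self, key: str, record: Dict[str, int]) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO records (key, body) VALUES (?, ?)",
                (key, json.dumps(record)),
            )

    def load(self, key: str) -> Optional[Dict[str, int]]:
        row = self._conn.execute(
            "SELECT body FROM records WHERE key = ?", (key,)
        ).fetchone()
        return json.loads(row[0]) if row else None


class AnnotationApi:
    """Client-facing API; it only ever talks to the abstract driver interface."""

    def __init__(self, driver: AbstractDriver) -> None:
        self._driver = driver

    def save_orf(self, name: str, start: int, end: int) -> None:
        self._driver.store(name, {"start": start, "end": end})

    def get_orf(self, name: str) -> Optional[Tuple[int, int]]:
        record = self._driver.load(name)
        return (record["start"], record["end"]) if record else None


# The same client code runs unchanged against either driver.
for driver in (InMemoryDriver(), SqliteDriver()):
    api = AnnotationApi(driver)
    api.save_orf("orf1", 0, 9)
```

Swapping `InMemoryDriver` for `SqliteDriver` requires no change to `AnnotationApi` or to its callers, which is the property the abstract driver concept is meant to guarantee.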