"Good annotation practice" for chemical data in biology
Kirill Degtyarenko*, Marcus Ennis and John Garavelli
European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
Abstract
A structural diagram, in the form of a two-dimensional (2-D) sketch, remains the most effective portrait of a "small molecule" or chemical reaction. However, such structural diagrams, as for any other core data, cannot be used in speech (and should not be used in free text). "Good annotation practice" for biological databases is to use either consistent and widely recognised terminology or unique identifiers from a dedicated database to refer to the molecule of interest. Ideally, scientists should use terminology that is both pronounceable and meaningful. Thus, a viable solution for a bioinformatician is to use a definitive controlled vocabulary of biochemical compounds and reactions, which contains both systematic and common names. In addition, chemical ontologies provide a means for placing entities of interest into wider chemical, biological or medical contexts. We present some challenges and achievements in the standardisation of chemical language in biological databases, with emphasis on three aspects of annotation: