Analysis of Gene Ontology features in microarray data using the Proteome BioKnowledge® Library
Robin J. Johnson1*, Jennifer M. Williams1, Barbara M. Schreiber2, Charles D. Elfe1, Kelley L. Lennon-Hopkins1, Marek S. Skrzypek3 and Renee D. White
1 Biobase Corporation, 100 Cummings Center, Ste. 420B, Beverly, MA 01915
Microarray technology has resulted in an explosion of complex, valuable data. Integrating data analysis tools with a comprehensive underlying database would allow efficient identification of common properties among differentially regulated genes. In this study we sought to compare the utility of various databases in microarray analysis.
The Proteome BioKnowledge® Library (BKL), a manually curated, proteome-wide compilation of the scientific literature, was used to generate a list of Gene Ontology (GO) Biological Process (BP) terms enriched among proteins involved in cardiovascular disease. Analysis of DNA microarray data generated in a study of rat vascular smooth muscle cell responses revealed significant enrichment in a number of GO BPs that were also enriched among cardiovascular disease-related proteins. In contrast, using annotation from LocusLink and chip annotation from the Gene Expression Omnibus yielded fewer enriched cardiovascular disease-associated GO BP terms. Data sets of orthologous genes from mouse and human were generated using the BKL Retriever. Analysis of these sets, focusing on BKL Disease annotation, revealed a significant association of these genes with cardiovascular disease. These results, together with the extensive experimental evidence supporting BKL GO and Disease features, underscore the benefits of using this database for microarray analysis.
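To make the notion of "enrichment" concrete, the sketch below shows one common way to score whether a GO BP term is over-represented in a gene list: a one-sided hypergeometric test. The abstract does not state which statistical test was used in the study, so this is an illustrative assumption rather than the authors' method. The function name `enrichment_p_value` and all counts in the usage lines are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population: number of genes on the array (the background set)
    annotated:  background genes carrying the GO term of interest
    sample:     number of differentially regulated genes
    hits:       differentially regulated genes carrying the GO term

    Assumption: this test is a common choice for GO enrichment; it is
    not confirmed as the test used in the study described above.
    """
    upper = min(annotated, sample)  # largest possible overlap
    total = comb(population, sample)  # all ways to draw the sample
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts, chosen only to show the call pattern.
p = enrichment_p_value(population=8000, annotated=120, sample=200, hits=12)
print(f"enrichment p-value: {p:.3g}")
```

A small p-value indicates that the term appears among the regulated genes more often than expected by chance. In practice, one p-value is computed per GO term, so a multiple-testing correction would normally follow.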