In Silico Biology 8, 0006 (2007); ©2007, Bioinformation Systems e.V.  



BioCompass: A novel functional inference tool that utilizes MeSH hierarchy to analyze groups of genes

Takeru Nakazato1,2,3*, Toru Takinaka4, Hironori Mizuguchi5, Hideo Matsuda2, Hidemasa Bono3 and Minoru Asogawa1,6

1 Bio-IT Business Promotion Center, NEC Corporation, 34 Miyukigaoka, Tsukuba, Ibaraki 305-8501, Japan
2 Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka 560-8531, Japan
3 Database Center for Life Science, Research Organization of Information and Systems, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
4 Bio-IT Business Promotion Center, NEC Corporation, 5-7-1 Shiba, Minato-ku, Tokyo 108-8001, Japan
5 Internet Systems Research Laboratories, NEC Corporation, 8916-46 Takayama-cho, Ikoma, Nara 630-0101, Japan
6 Fundamental and Environmental Research Laboratories, NEC Corporation, 34 Miyukigaoka, Tsukuba, Ibaraki 305-8501, Japan


* Corresponding author
   Email: nakazato@dbcls.rois.ac.jp


Edited by E. Wingender; received May 01, 2007; revised August 17, October 16, and December 17, 2007; accepted December 17, 2007; published December 22, 2007


Abstract

Microarray technology is now widely used by biological researchers to identify genes associated with conditions such as diseases and drug treatments. To date, many methods have been developed to analyze data covering a large number of genes, but they focus only on statistical significance and cannot interpret the data in terms of biological concepts. Gene Ontology (GO) is used to give such data a biological interpretation; however, it is restricted to specific ontologies, namely biological process, molecular function, and cellular component. Here, we attempted to apply MeSH (Medical Subject Headings) to interpret groups of genes from a biological viewpoint. To assign MeSH terms to genes, we retrieved contexts associated with genes from the full set of MEDLINE data using machine learning and then extracted MeSH terms from the retrieved articles. Using this method, we implemented a software tool called BioCompass. It uses the MeSH and GO trees to generate high-scoring lists and hierarchical lists of disease MeSH terms associated with groups of genes, and it draws a wiring diagram that links genes through associations extracted from articles. Researchers can easily retrieve genes and keywords of interest, such as diseases and drugs, associated with groups of genes. By using the retrieved MeSH terms in conjunction with OMIM, we could obtain more disease information associated with a target gene. BioCompass helps researchers interpret groups of genes, such as those obtained from microarray data, from a biological viewpoint.
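As a rough illustration of the idea of high-scoring and hierarchical term lists, the following minimal Python sketch ranks terms for a group of genes and rolls the counts up a MeSH-style tree. It is not the BioCompass implementation: the gene names, the gene-to-term assignments, and the simple gene-count score are all assumptions made for this example, and the tree numbers are used only to show the dotted MeSH format, in which an ancestor is obtained by dropping trailing segments.

```python
from collections import Counter

# Hypothetical toy data: gene -> MeSH-style tree numbers assigned to it.
# These assignments are invented for illustration only.
gene_to_terms = {
    "geneA": ["C04.588.180", "C14.280"],
    "geneB": ["C04.588.180"],
    "geneC": ["C04.557"],
}
group = ["geneA", "geneB", "geneC"]


def ancestors(tree_number):
    """Return the tree number followed by its ancestors, most specific first."""
    parts = tree_number.split(".")
    return [".".join(parts[:i]) for i in range(len(parts), 0, -1)]


def rank_terms(gene_to_terms, gene_group):
    """High-scoring list: number of genes in the group carrying each term.

    A plain gene count is used here as a stand-in score (an assumption).
    """
    counts = Counter()
    for gene in gene_group:
        counts.update(set(gene_to_terms.get(gene, ())))
    return counts.most_common()


def roll_up(gene_to_terms, gene_group):
    """Hierarchical list: count genes per node after adding all ancestors."""
    counts = Counter()
    for gene in gene_group:
        nodes = set()
        for tree_number in gene_to_terms.get(gene, ()):
            nodes.update(ancestors(tree_number))
        counts.update(nodes)  # each gene is counted once per node
    return counts


def format_tree(counts):
    """Render rolled-up counts as indented lines ordered by tree number."""
    lines = []
    for tree_number in sorted(counts):
        depth = tree_number.count(".")
        lines.append("  " * depth + f"{tree_number} ({counts[tree_number]} genes)")
    return lines


if __name__ == "__main__":
    print(rank_terms(gene_to_terms, group))
    print("\n".join(format_tree(roll_up(gene_to_terms, group))))
```

Counting each gene once per node keeps a broad ancestor category from being inflated by a single gene that carries several of its descendant terms.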


Keywords: MeSH terms, Gene Ontology, MEDLINE, text mining, OMIM, machine learning, microarray