In Silico Biology 5, 0003 (2004); ©2004, Bioinformation Systems e.V.  

PRIME: automatically extracted PRotein Interactions and Molecular Information databasE

Asako Koike1, 2,* and Toshihisa Takagi1

1 Dept. of Computational Biology, Graduate School of Frontier Science, The University of Tokyo, Kiban-3A1(CB01) 5-1-5, Kashiwanoha Kashiwa, Chiba, 277-8561, Japan
2 Central Research Laboratory, Hitachi Ltd. 1-280 Higashi-koigakubo Kokubunji city, Tokyo, 185-8601, Japan

* Corresponding author; Email:; Phone: +81-4-7136 3982; Fax: +81-4-7136 3975

Edited by H. Michael; received September 23, 2004; revised and accepted November 25, 2004; published December 22, 2004


As the amount of information in the biomedical field grows exponentially, advanced information retrieval and information extraction, as well as databases, are becoming increasingly important. PRIME is an integrated gene/protein informatics database based on natural language processing. It provides automatically extracted protein/family/gene/compound interaction information, covering both physical and genetic interactions, together with Gene Ontology-based functions and graphical pathway viewers. Gene/protein/family names and functional terms are recognized using dictionaries developed in our laboratory. Interaction and functional information is extracted using syntactic dependencies and various phrase patterns. The database includes about 920,000 (non-redundant) protein interactions and 360,000 annotated gene-function relationships for major eukaryotes. By combining sequence and text information, PRIME also supports pathway comparison between two organisms, simple pathway deduction based on interaction data from other organisms, and pathway filtering using tissue expression data. This database is accessible at

Keywords: protein interaction, biological process, pathway database, natural language processing
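
As an illustration of the dictionary-based name recognition and phrase-pattern extraction summarized in the abstract, the following is a minimal sketch and not PRIME's implementation. The dictionary entries, the verb list, and the function names `find_names` and `extract_interactions` are all hypothetical. PRIME is described as using syntactic dependencies; this sketch substitutes plain string matching of the single pattern "name, verb, name" to keep the example short.

```python
import re

# Hypothetical toy dictionary: surface form -> canonical name.
# This is an assumption for illustration, not PRIME's actual dictionary.
NAME_DICT = {
    "p53": "TP53",
    "tp53": "TP53",
    "mdm2": "MDM2",
    "cdk2": "CDK2",
    "cyclin e": "CCNE1",
}

# Hypothetical interaction verbs; a real system would need many more
# patterns and a syntactic parser.
INTERACTION_VERBS = ("binds to", "interacts with", "phosphorylates",
                     "activates", "inhibits")


def find_names(sentence):
    """Return sorted (start, end, canonical) tuples for dictionary hits.

    Longer surface forms are tried first so that a short name is not
    accepted inside a span already claimed by a longer one.
    """
    hits = []
    lowered = sentence.lower()
    for surface in sorted(NAME_DICT, key=len, reverse=True):
        # Require non-alphanumeric boundaries so "p53" does not match in "tp53".
        pattern = r"(?<![A-Za-z0-9])" + re.escape(surface) + r"(?![A-Za-z0-9])"
        for m in re.finditer(pattern, lowered):
            if all(m.end() <= s or m.start() >= e for s, e, _ in hits):
                hits.append((m.start(), m.end(), NAME_DICT[surface]))
    return sorted(hits)


def extract_interactions(sentence):
    """Extract (name, verb, name) triples from adjacent recognized names."""
    names = find_names(sentence)
    lowered = sentence.lower()
    results = []
    for (_, e1, a), (s2, _, b) in zip(names, names[1:]):
        between = lowered[e1:s2].strip()
        if between in INTERACTION_VERBS:
            results.append((a, between, b))
    return results


if __name__ == "__main__":
    print(extract_interactions("MDM2 binds to p53 and Cyclin E activates CDK2."))
```

Matching only the exact text between two adjacent names is deliberately strict: it misses passive constructions and intervening modifiers, which is the kind of variation that dependency-based extraction is meant to handle.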