BioMap: toward the development of a knowledge base of biomedical literature

  • Authors:
  • Kamal Kumar;Mathew J. Palakal;Snehasis Mukhopadhyay;Mathew J. Stephens;Huian Li

  • Affiliations:
  • Indiana University Purdue University, Indianapolis, IN;Indiana University Purdue University, Indianapolis, IN;Indiana University Purdue University, Indianapolis, IN;Indiana University School of Medicine, Indianapolis, IN;Indiana University Purdue University, Indianapolis, IN

  • Venue:
  • Proceedings of the 2004 ACM symposium on Applied computing
  • Year:
  • 2004

Abstract

Biological literature databases continue to grow rapidly, and they hold information vital to sound biomedical research. As this data and information space grows exponentially, researchers increasingly need to survey the published literature rapidly and to synthesize and discover the "knowledge" embedded in it, so that they can conduct "informed" work, avoid repetition, and generate new hypotheses. Knowledge, in this case, is defined as the one-to-many and many-to-many relationships among biological entities such as genes, proteins, drugs, and diseases. The knowledge discovery process involves identifying biological object names, resolving references, discovering ontologies and synonyms, and finally extracting object-object relationships. The overall goal of this work is to investigate and develop a complete knowledge base, called BioMap, from the entire MEDLINE collection of over 12 million bibliographic citations and author abstracts drawn from more than 4600 biomedical journals worldwide, and to develop an interactive knowledge network through which users can access this secondary knowledge (BioMap) along with its primary databases, such as MEDLINE. In this paper we present the organization of a distributed database system that maintains the BioMap knowledge base, along with some preliminary results on the biological object name identification problem, based on an initial set of 30,000 MEDLINE abstracts.
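
To make the notion of "knowledge" as many-to-many relationships among biological objects more concrete, the following is a minimal Python sketch, not the BioMap implementation. It assumes a tiny hand-made lexicon and treats co-occurrence within one abstract as a relationship. The lexicon entries, the function names `identify_objects` and `build_relationships`, and the two sample sentences are all hypothetical and chosen only for illustration. The abstract does not describe the actual name identification or relationship extraction methods.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon: surface form -> (canonical name, entity type).
# Synonyms share a canonical name, a crude stand-in for synonym discovery.
LEXICON = {
    "p53": ("TP53", "gene"),
    "tp53": ("TP53", "gene"),
    "brca1": ("BRCA1", "gene"),
    "tamoxifen": ("tamoxifen", "drug"),
    "breast cancer": ("breast cancer", "disease"),
}

# Try longer surface forms first so multi-word names are matched whole.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(LEXICON, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def identify_objects(text):
    """Return the set of (canonical name, type) pairs mentioned in text."""
    return {LEXICON[m.group(1).lower()] for m in _PATTERN.finditer(text)}


def build_relationships(abstracts):
    """Link every pair of objects that co-occur in the same abstract.

    Assumption: co-occurrence is used here only as a simple proxy for an
    object-object relationship.
    """
    graph = defaultdict(set)
    for text in abstracts:
        found = sorted(identify_objects(text))
        for i, a in enumerate(found):
            for b in found[i + 1:]:
                graph[a].add(b)
                graph[b].add(a)
    return graph


# Made-up sentences standing in for MEDLINE abstracts.
abstracts = [
    "Mutations in p53 and BRCA1 are frequent in breast cancer.",
    "Tamoxifen reduces recurrence of breast cancer.",
]
graph = build_relationships(abstracts)
for entity in sorted(graph):
    print(entity, "->", sorted(graph[entity]))
```

In this sketch the disease node ends up linked to two genes and one drug, which is the one-to-many shape the abstract describes. A real system would also need the reference resolution and ontology steps, which are omitted here.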