On-the-fly data integration models for biological databases

  • Authors:
  • Pavithra G. Naidu; Mathew J. Palakal; Shielly Hartanto

  • Affiliations:
  • Indiana University Purdue University, Indianapolis, IN; Indiana University Purdue University, Indianapolis, IN; Indiana University Purdue University, Indianapolis, IN

  • Venue:
  • Proceedings of the 2007 ACM symposium on Applied computing
  • Year:
  • 2007

Abstract

The web is a universal repository of information, and it offers an excellent opportunity to integrate online biological resources for knowledge discovery. A major challenge is to support the effective flow of information among the sources and services on the web, and to connect them with legacy systems that are designed to operate with traditional relational databases. One possible strategy is to combine information from disparate data sources and present it to the user in a single integrated framework, without having to populate local databases. This is called online or on-the-fly data integration. BioXBase is a user-centric biological query system that extracts the information a user requests from multiple biological sources over the Internet, then cleans, processes and integrates the data to present a wide variety of information as a homogeneous, unified view. The BioXBase system improved the retrieved results by approximately 30% compared to a system that has only a local database. Combining the results of BioMap (a local database) and BioXBase (the on-the-fly system) improved the results by a further 20%, making them more significant in the biological domain. The results were validated using precision, recall and power-law degree distribution analysis.
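To make the idea concrete, the following is a minimal sketch in Python of merging results from a local source and an on-the-fly source into one unified set, then scoring the merged set with precision and recall. It is not the BioXBase implementation. The function names (`normalize`, `integrate`, `precision_recall`), the choice to represent a result as an unordered pair of entity names, and all of the data are hypothetical placeholders chosen for illustration; the abstract does not specify any of them.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[str, str]


def normalize(pair: Pair) -> Pair:
    """Clean one record: trim whitespace, lowercase, and order the two names.

    Treating a result as an unordered pair is an assumption of this sketch.
    """
    a, b = (name.strip().lower() for name in pair)
    return (a, b) if a <= b else (b, a)


def integrate(*sources: Iterable[Pair]) -> Set[Pair]:
    """Merge any number of sources into one de-duplicated, unified set."""
    merged: Set[Pair] = set()
    for source in sources:
        for pair in source:
            merged.add(normalize(pair))
    return merged


def precision_recall(retrieved: Set[Pair], relevant: Set[Pair]) -> Tuple[float, float]:
    """Precision = hits / retrieved; recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up placeholder data, not results from the paper.
local_results = [("GeneA", "GeneB"), ("GeneC", "GeneD")]
online_results = [("geneb ", "GeneA"), ("GeneA", "GeneE"), ("GeneF", "GeneG")]
reference = {
    normalize(p)
    for p in [("GeneA", "GeneB"), ("GeneA", "GeneE"),
              ("GeneC", "GeneD"), ("GeneH", "GeneI")]
}

local_only = integrate(local_results)
merged = integrate(local_results, online_results)

print("local only:", precision_recall(local_only, reference))
print("combined:  ", precision_recall(merged, reference))
```

In this toy data the online source repeats one local record with different casing and spacing, so the cleaning step is what prevents a duplicate in the unified view. Adding the online source raises recall while lowering precision, which shows why both measures are reported together.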