Query Planning for Searching Inter-dependent Deep-Web Databases
SSDBM '08 Proceedings of the 20th international conference on Scientific and Statistical Database Management
Exploiting the complex maze of publicly available biological resources to implement scientific data collection pipelines poses many challenges for biologists, both in accurately expressing the scientific question at hand and in selecting the resources that best satisfy their needs. We extended our BioNavigation system to address these challenges and to help scientists visualize and navigate the resources, express their queries, and determine the most suitable set of resources to evaluate them. For this purpose, we use an ontology that describes the higher, logical level of scientific concepts and their relationships. A user can browse and visualize this ontology and then graphically select the relevant nodes and edges to build a query. We developed the ESearch algorithm, which searches the physical level of resources to generate paths that express the ontological query. The algorithm also ranks the paths according to three semantic metrics: target object cardinality, to optimize the number of records in the output dataset; path cardinality, to optimize the number of links between the data sources involved; and evaluation cost, to minimize the cost incurred in executing that evaluation path. These metrics allow the user to select the path that best matches their requirements.
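To make the idea of mapping an ontological query onto physical resource paths and ranking them more concrete, the following is a minimal Python sketch. It is not the ESearch algorithm itself, whose details are not given here. The resource names, link statistics, and the way each metric is aggregated along a path (final-hop record count, summed link counts, summed costs) are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical physical level: each ontology concept is implemented
# by one or more resources. All names are invented for illustration.
RESOURCES = {
    "Gene": ["GeneDB_A", "GeneDB_B"],
    "Protein": ["ProtDB_A"],
    "Publication": ["PubDB_A", "PubDB_B"],
}

# Hypothetical links between resources, with made-up statistics:
# number of links, number of distinct target records, and access cost.
LINKS = {
    ("GeneDB_A", "ProtDB_A"): {"links": 500, "targets": 300, "cost": 2.0},
    ("GeneDB_B", "ProtDB_A"): {"links": 800, "targets": 450, "cost": 5.0},
    ("ProtDB_A", "PubDB_A"): {"links": 1200, "targets": 900, "cost": 1.5},
    ("ProtDB_A", "PubDB_B"): {"links": 400, "targets": 350, "cost": 0.5},
}


@dataclass(frozen=True)
class EvaluatedPath:
    resources: tuple
    target_cardinality: int   # assumed: records reached by the final hop
    path_cardinality: int     # assumed: sum of link counts along the path
    evaluation_cost: float    # assumed: sum of per-hop costs


def enumerate_paths(concepts):
    """Return every resource path that realizes the given concept sequence."""
    if len(concepts) < 2:
        raise ValueError("a query needs at least two concepts")
    paths = []
    # Try every combination of one resource per concept.
    for resources in product(*(RESOURCES[c] for c in concepts)):
        hops = list(zip(resources, resources[1:]))
        # Keep the combination only if each consecutive pair is linked.
        if not all(hop in LINKS for hop in hops):
            continue
        stats = [LINKS[hop] for hop in hops]
        paths.append(EvaluatedPath(
            resources=resources,
            target_cardinality=stats[-1]["targets"],
            path_cardinality=sum(s["links"] for s in stats),
            evaluation_cost=sum(s["cost"] for s in stats),
        ))
    return paths


def rank(paths, metric, descending=False):
    """Sort paths by one of the three metric attribute names."""
    return sorted(paths, key=lambda p: getattr(p, metric), reverse=descending)


paths = enumerate_paths(["Gene", "Protein", "Publication"])
cheapest = rank(paths, "evaluation_cost")[0]
densest = rank(paths, "path_cardinality", descending=True)[0]
print(cheapest.resources, cheapest.evaluation_cost)
print(densest.resources, densest.path_cardinality)
```

In this sketch the user picks which metric to rank by, mirroring the idea that different requirements (more output records, more links, or lower cost) can favor different paths. Exhaustive enumeration is used only for brevity and would not scale to a large number of resources.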