SEEDEEP: A System for Exploring and Querying Scientific Deep Web Data Sources

  • Authors:
  • Fan Wang; Gagan Agrawal

  • Affiliations:
  • Department of Computer Science and Engineering, Ohio State University, Columbus, OH 43210; Department of Computer Science and Engineering, Ohio State University, Columbus, OH 43210

  • Venue:
  • SSDBM 2009: Proceedings of the 21st International Conference on Scientific and Statistical Database Management
  • Year:
  • 2009

Abstract

An emerging trend in scientific data dissemination involves online databases that are hidden behind query forms, forming what is referred to as the deep web. In this paper, we propose SEEDEEP, a System for Exploring and quErying scientific DEEP web data sources. SEEDEEP automatically mines deep web data source schemas, integrates heterogeneous data sources, answers cross-source keyword queries, and incorporates features such as caching and fault tolerance. Currently, SEEDEEP integrates 16 deep web data sources in the biological domain. We demonstrate how an integrated model for correlated deep web data sources is constructed, how a complex cross-source keyword query is answered efficiently and correctly, and how important performance issues are addressed.