Approximate content summary for database selection in deep web data integration

  • Authors:
  • Fangjiao Jiang;Yukun Li;Jiping Zhao;Nan Yang

  • Affiliations:
  • Institute of Intelligent Information Processing, Xuzhou Normal University, Jiangsu, China;School of Information, Renmin University of China;Institute of Intelligent Information Processing, Xuzhou Normal University, Jiangsu, China;School of Information, Renmin University of China

  • Venue:
  • WAIM'10 Proceedings of the 2010 international conference on Web-age information management
  • Year:
  • 2010

Abstract

In Deep Web data integration, a metaquerier provides a unified query interface for each domain and dispatches each user query to the most relevant Web databases. Traditional database selection algorithms are often based on content summaries. However, many Web-accessible databases are uncooperative, and the only way to access their contents is by querying them. In this paper, we propose an approximate content summary approach for database selection. Furthermore, real-life databases are not always static, so the statistical content summary must be updated periodically to reflect changes in database content. We therefore also propose a survival function approach that determines an appropriate schedule for regenerating the approximate content summary. We conduct extensive experiments to illustrate the accuracy and efficiency of our techniques.
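The abstract does not give the sampling procedure, the selection score, or the form of the survival function, so the following Python sketch is only an illustration of the general idea under stated assumptions, not the authors' method. It assumes that documents have already been sampled from each database through its query interface, that a database is scored by the product of the estimated document frequencies of the query words, and that summary validity follows an exponential survival function S(t) = exp(-rate * t). All function names and parameters (`build_summary`, `select_databases`, `refresh_interval`, `rate`, `threshold`) are hypothetical.

```python
import math
from collections import Counter


def build_summary(sample_docs):
    """Approximate content summary: word -> fraction of sampled documents containing it."""
    doc_freq = Counter()
    for doc in sample_docs:
        # Count each word at most once per document.
        doc_freq.update(set(doc.lower().split()))
    n = len(sample_docs)
    return {word: count / n for word, count in doc_freq.items()} if n else {}


def score(summary, query):
    """Assumed score: product of estimated document frequencies of the query words."""
    result = 1.0
    for word in query.lower().split():
        result *= summary.get(word, 0.0)
    return result


def select_databases(summaries, query, k=1):
    """Return the names of the k databases with the highest assumed score."""
    ranked = sorted(summaries, key=lambda name: score(summaries[name], query), reverse=True)
    return ranked[:k]


def survival(t, rate):
    """Assumed exponential survival function: probability a summary is still valid after time t."""
    return math.exp(-rate * t)


def refresh_interval(rate, threshold):
    """Largest t with survival(t, rate) >= threshold, i.e. t = -ln(threshold) / rate."""
    return -math.log(threshold) / rate


# Hypothetical sampled documents for two databases.
summaries = {
    "books": build_summary(["deep web data", "web data integration"]),
    "cars": build_summary(["used car sale", "car data"]),
}
selected = select_databases(summaries, "web data")
interval = refresh_interval(rate=0.1, threshold=0.5)
```

In this sketch, a summary would be regenerated once its estimated survival probability drops to `threshold`; a faster-changing database (larger `rate`) yields a shorter interval.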