Efficient Top-k Data Sources Ranking for Query on Deep Web

  • Authors:
  • Derong Shen;Meifang Li;Ge Yu;Yue Kou;Tiezheng Nie

  • Affiliations:
  • Northeastern University, Shenyang, China 110004;Northeastern University, Shenyang, China 110004;Northeastern University, Shenyang, China 110004;Northeastern University, Shenyang, China 110004;Northeastern University, Shenyang, China 110004

  • Venue:
  • WISE '08 Proceedings of the 9th international conference on Web Information Systems Engineering
  • Year:
  • 2008

Abstract

Efficient query processing on the deep web has been gaining importance due to the large number of deep web data sources. Nevertheless, discovering the most relevant data sources on the deep web remains a challenging issue. Inspired by observations on the deep web, this paper presents a novel top-k ranking strategy that ranks relevant data sources according to the user's requirements. First, it applies an attribute-based dominant pattern growth (ADP-growth) algorithm to mine the most dominant attributes. It then employs a top-k style ranking algorithm on those attributes to identify the most relevant data sources, using candidate pruning and early termination and taking the probability of result merging into account. Further, it improves the algorithm by incorporating a relevant-attribute-based search strategy for finding data sources, which is shown to be more efficient. We have conducted extensive experiments on a real-world dataset and demonstrated the efficiency and effectiveness of our approach.
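
The abstract does not give the ranking algorithm itself, so the following is only an illustrative sketch of what top-k ranking with early termination can look like. It uses a generic threshold-style algorithm (in the spirit of Fagin's Threshold Algorithm), not the paper's ADP-growth or its ranking method. The function name `top_k_sources`, the per-attribute relevance scores, and the choice of summation as the aggregate are all assumptions made for illustration.

```python
import heapq


def top_k_sources(attribute_scores, k):
    """Generic threshold-style top-k sketch (NOT the paper's algorithm).

    attribute_scores: dict mapping attribute -> {source: score}, scores >= 0.
    A source missing from an attribute is assumed to score 0 there.
    The aggregate score of a source is assumed to be the sum over attributes.
    Returns (ranked list of (score, source), number of rows scanned).
    """
    tables = list(attribute_scores.values())
    # One list per attribute, sorted by descending score (sorted access).
    lists = [sorted(t.items(), key=lambda kv: -kv[1]) for t in tables]
    max_depth = max((len(lst) for lst in lists), default=0)

    seen = set()
    heap = []  # min-heap holding the best k (score, source) pairs so far
    depth = 0
    while depth < max_depth:
        # Upper bound on the aggregate score of any source not yet seen.
        threshold = 0.0
        for lst in lists:
            if depth >= len(lst):
                continue  # exhausted list: unseen sources score 0 here
            source, score = lst[depth]
            threshold += score
            if source in seen:
                continue
            seen.add(source)
            # Random access: look up the source's score in every attribute.
            total = sum(t.get(source, 0.0) for t in tables)
            if len(heap) < k:
                heapq.heappush(heap, (total, source))
            elif total > heap[0][0]:
                heapq.heapreplace(heap, (total, source))
        depth += 1
        # Early termination: no unseen source can beat the current k-th best.
        if len(heap) == k and heap[0][0] >= threshold:
            break

    ranked = sorted(heap, key=lambda pair: (-pair[0], pair[1]))
    return ranked, depth
```

With hypothetical scores for two attributes over four sources, for example `{"title": {"A": 0.9, "B": 0.8, "C": 0.1, "D": 0.05}, "author": {"A": 0.8, "B": 0.9, "C": 0.2, "D": 0.1}}` and `k = 2`, the scan stops after two of the four rows, because by then the threshold has dropped below the score of the second-best source found so far.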