Approximate membership localization (AML) for web-based join

  • Authors:
  • Zhixu Li;Laurianne Sitbon;Liwei Wang;Xiaofang Zhou;Xiaoyong Du

  • Affiliations:
  • The University of Queensland, Brisbane, Australia;The University of Queensland, Brisbane, Australia;Wuhan University, Wuhan, China;The University of Queensland, Brisbane, Australia;Renmin University of China, Beijing, China

  • Venue:
  • CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
  • Year:
  • 2010

Abstract

In this paper, we propose a search-based approach to joining two tables in the absence of clean join attributes. Unstructured documents from the web are used to express the correlations between a given query and a reference list. A major challenge in implementing this approach is to efficiently determine how many times, and at which locations, each clean reference from the reference list is approximately mentioned in the retrieved documents. We formalize this as the Approximate Membership Localization (AML) problem and propose an efficient partial pruning algorithm to solve it. A study using real-world data sets demonstrates the effectiveness of our search-based approach and the efficiency of our AML algorithm.
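
To make the AML problem statement concrete, the following is a minimal brute-force sketch of what "localizing" approximate mentions could look like. It is not the partial pruning algorithm proposed in the paper, which the abstract does not describe. The token-level Jaccard similarity, the 0.6 threshold, the window-length bound, the greedy overlap resolution, and the names `jaccard` and `localize` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def jaccard(a: set, b: set) -> float:
    """Token-set Jaccard similarity (an assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def localize(
    document: str,
    references: List[str],
    threshold: float = 0.6,  # assumed cut-off, not taken from the paper
) -> Dict[str, List[Tuple[int, int]]]:
    """Return, per reference, the token spans [start, end) that approximately
    mention it. Overlapping candidates are resolved greedily, best score first."""
    tokens = document.lower().split()
    candidates = []  # (score, start, end, reference)
    for ref in references:
        ref_tokens = set(ref.lower().split())
        # Assumption: a mention is at most one token longer than the reference.
        max_len = len(ref_tokens) + 1
        for start in range(len(tokens)):
            last_end = min(start + max_len, len(tokens))
            for end in range(start + 1, last_end + 1):
                score = jaccard(set(tokens[start:end]), ref_tokens)
                if score >= threshold:
                    candidates.append((score, start, end, ref))

    # Highest score first; ties broken by position so the result is deterministic.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    taken = [False] * len(tokens)
    result: Dict[str, List[Tuple[int, int]]] = {ref: [] for ref in references}
    for _score, start, end, ref in candidates:
        if not any(taken[start:end]):
            for i in range(start, end):
                taken[i] = True
            result[ref].append((start, end))
    for spans in result.values():
        spans.sort()
    return result


if __name__ == "__main__":
    doc = ("the university of queensland campus is in brisbane "
           "while queensland university hosts the lab")
    refs = ["university of queensland", "wuhan university"]
    for ref, spans in localize(doc, refs).items():
        print(ref, len(spans), spans)
```

The number of spans returned for a reference gives its mention count, and the spans themselves give the locations. This sketch scores every window against every reference, which is the kind of exhaustive cost that a pruning strategy would aim to avoid.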