Range search on multidimensional uncertain data

  • Authors: Yufei Tao; Xiaokui Xiao; Reynold Cheng

  • Affiliations: Chinese University of Hong Kong, New Territories, Hong Kong; Chinese University of Hong Kong, New Territories, Hong Kong; Hong Kong Polytechnic University, Kowloon, Hong Kong

  • Venue: ACM Transactions on Database Systems (TODS)
  • Year: 2007

Abstract

In an uncertain database, every object o is associated with a probability density function, which describes the likelihood that o appears at each position in a multidimensional workspace. This article studies two types of range retrieval fundamental to many analytical tasks. Specifically, a nonfuzzy query returns all the objects that appear in a search region r_q with probability at least t_q. On the other hand, given an uncertain object q, a fuzzy query retrieves the set of objects that are within distance ε_q from q with probability at least t_q. The core of our methodology is a novel concept of “probabilistically constrained rectangle”, which permits effective pruning of nonqualifying data and validation of qualifying data. We develop a new index structure called the U-tree for minimizing the query overhead. Our algorithmic findings are accompanied by a thorough theoretical analysis, which reveals valuable insight into the problem characteristics, and mathematically confirms the efficiency of our solutions. We verify the effectiveness of the proposed techniques with extensive experiments.
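
To make the nonfuzzy query concrete, here is a minimal Python sketch. It is not the U-tree, and it does not use probabilistically constrained rectangles. It assumes, purely for illustration, that each object's density is uniform over an axis-aligned rectangle, so the appearance probability is simply the fraction of that rectangle covered by the search region. The names `UncertainObject`, `appearance_probability`, and `nonfuzzy_range_query` are hypothetical, and `rq` and `tq` stand for the search region and probability threshold of the abstract.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# A rectangle is a (lower corner, upper corner) pair of equal-length coordinate lists.
Rect = Tuple[Sequence[float], Sequence[float]]


@dataclass
class UncertainObject:
    name: str
    region: Rect  # uncertainty region; ASSUMPTION: the pdf is uniform inside it


def appearance_probability(obj: UncertainObject, rq: Rect) -> float:
    """Probability that obj lies in rq, under the uniform-pdf assumption."""
    lo, hi = obj.region
    q_lo, q_hi = rq
    prob = 1.0
    for l, h, ql, qh in zip(lo, hi, q_lo, q_hi):
        overlap = min(h, qh) - max(l, ql)  # overlap length on this axis
        if overlap <= 0:
            return 0.0  # disjoint on one axis, so the object cannot appear in rq
        prob *= overlap / (h - l)  # covered fraction of the region on this axis
    return prob


def nonfuzzy_range_query(objects: List[UncertainObject], rq: Rect, tq: float) -> List[str]:
    """Names of objects whose appearance probability in rq is at least tq."""
    return [o.name for o in objects if appearance_probability(o, rq) >= tq]


objects = [
    UncertainObject("a", ([0.0, 0.0], [2.0, 2.0])),  # half of the region lies inside rq
    UncertainObject("b", ([4.0, 4.0], [5.0, 5.0])),  # disjoint from rq
    UncertainObject("c", ([1.0, 1.0], [2.0, 2.0])),  # fully inside rq
]
rq = ([1.0, 0.0], [3.0, 2.0])
result = nonfuzzy_range_query(objects, rq, tq=0.5)
print(result)  # ['a', 'c']
```

This sketch scans every object and computes each probability exactly. With arbitrary density functions that computation generally requires numerical integration, which is the kind of per-object cost that pruning and validation are meant to avoid.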