Dynamic structures for top-k queries on uncertain data

  • Authors:
  • Jiang Chen;Ke Yi

  • Affiliations:
  • Center for Computational Learning Systems, Columbia University, New York, NY;Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong

  • Venue:
  • ISAAC'07 Proceedings of the 18th international conference on Algorithms and computation
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

In an uncertain data set S = (S, p, f) where S is the ground set consisting of n elements, p : S → [0, 1] a probability function, and f : S → R a score function, each element i ∈ S with score f(i) appears independently with probability p(i). The top-k query on S asks for the set of k elements that has the maximum probability of appearing to be the k elements with the highest scores in a random instance of S. Computing the top-k answer on a fixed S is known to be easy. In this paper, we consider the dynamic problem, that is, how to maintain the top-k query answer when S changes, including element insertion and deletions in the ground set S, changes in the probability function p and the score function f. We present a fully dynamic data structure that handles an update in O(k log k log n) time, and answers a top-j query in O(log n+j) time for any j ≤ k. The structure has O(n) size and can be constructed in O(n log2 k) time. As a building block of our dynamic structure, we present an algorithm for the all-top-k problem, that is, computing the top-j answers for all j = 1, ..., k, which may be of independent interest.