A dynamic data structure for top-k queries on uncertain data

  • Authors:
  • Jiang Chen;Ke Yi

  • Affiliations:
  • Yahoo! Inc., 701 First Avenue, Sunnyvale, CA 94089, USA;Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2008

Quantified Score

Hi-index 5.23

Visualization

Abstract

In an uncertain data setS=(S,p,f) where S is the ground set consisting of n elements, p:S-[0,1] a probability function, and f:S-R a score function, each element i@?S with score f(i) appears independently with probability p(i). The top-k query on S asks for the set of k elements that has the maximum probability of appearing to be the k elements with the highest scores in a random instance of S. Computing the top-k answer on a fixed S is known to be easy. In this paper, we consider the dynamic problem, that is, how to maintain the top-k query answer when S changes, including element insertions and deletions in the ground set S, changes in the probability function p and in the score function f. We present a fully dynamic data structure that handles an update in O(klogn) time, and answers a top-j query in O(logn+j) time for any j@?k. The structure has O(n) size and can be constructed in O(nlogk) time. As a building block of our dynamic structure, we present an algorithm for the all-top-k problem, that is, computing the top-j answers for all j=1,...,k, which may be of independent interest.