Dynamic structures for top-k queries on uncertain data

Authors:
Jiang Chen; Ke Yi
Affiliations:
Center for Computational Learning Systems, Columbia University, New York, NY; Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong
Venue:
ISAAC'07 Proceedings of the 18th international conference on Algorithms and computation
Year:
2007


Abstract

An uncertain data set 𝒮 = (S, p, f) consists of a ground set S of n elements, a probability function p : S → [0, 1], and a score function f : S → ℝ; each element i ∈ S has score f(i) and appears independently with probability p(i). The top-k query on 𝒮 asks for the set of k elements that has the maximum probability of being the k highest-scored elements in a random instance of 𝒮. Computing the top-k answer on a fixed 𝒮 is known to be easy. In this paper, we consider the dynamic problem: how to maintain the top-k answer as 𝒮 changes, where changes include insertions and deletions of elements in the ground set S as well as changes to the probability function p and the score function f. We present a fully dynamic data structure that handles an update in O(k log k log n) time and answers a top-j query in O(log n + j) time for any j ≤ k. The structure has O(n) size and can be constructed in O(n log² k) time. As a building block of our dynamic structure, we present an algorithm for the all-top-k problem, that is, computing the top-j answers for all j = 1, ..., k, which may be of independent interest.
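
To make the query definition concrete, the following is a minimal sketch of the static case only (a fixed data set), not the dynamic structure described in the abstract. It assumes distinct scores and represents each element as a hypothetical `(score, probability)` pair; the function names are illustrative and do not come from the paper. The sketch relies on one observation: a k-set is the top-k of a random instance exactly when all of its members appear and every higher-scored non-member (relative to the set's lowest score) does not appear. For a fixed lowest-scored member, an exchange argument shows that the best companions are the k - 1 higher-scored elements with the largest probabilities.

```python
def topk_probability(elements, subset):
    """Probability that `subset` is exactly the k highest-scored elements
    appearing in a random instance (assumes distinct scores)."""
    chosen = set(subset)
    threshold = min(elements[i][0] for i in chosen)
    result = 1.0
    for idx, (score, p) in enumerate(elements):
        if idx in chosen:
            result *= p          # every member must appear
        elif score > threshold:
            result *= 1.0 - p    # higher-scored non-members must be absent
    return result


def static_top_k(elements, k):
    """Return (best index set, its probability) for a fixed data set.

    A simple quadratic-time scan for illustration; returns (None, -1.0)
    when fewer than k elements exist.
    """
    # Indices sorted by descending score.
    order = sorted(range(len(elements)), key=lambda i: -elements[i][0])
    best_set, best_prob = None, -1.0
    for t in range(k - 1, len(order)):
        last = order[t]  # candidate for the lowest-scored member
        # Companions: the k-1 most probable elements that outscore `last`.
        companions = sorted(order[:t], key=lambda i: -elements[i][1])[:k - 1]
        candidate = companions + [last]
        prob = topk_probability(elements, candidate)
        if prob > best_prob:
            best_set, best_prob = set(candidate), prob
    return best_set, best_prob


# Hypothetical data: (score, probability) pairs.
data = [(10, 0.3), (9, 0.9), (8, 0.8), (7, 0.95)]
answer, probability = static_top_k(data, 2)
```

On this sample, the highest-scored element is left out of the answer because it appears with probability only 0.3, which illustrates that the top-k answer depends on probabilities as well as scores.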