Continuous probabilistic skyline queries over uncertain data streams

  • Authors:
  • Hui Zhu Su; En Tzu Wang; Arbee L. P. Chen

  • Affiliations:
  • Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, R.O.C.;Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, R.O.C.;Department of Computer Science, National Chengchi University, Taipei, Taiwan, R.O.C.

  • Venue:
  • DEXA'10 Proceedings of the 21st international conference on Database and expert systems applications: Part I
  • Year:
  • 2010

Abstract

Several approaches to finding probabilistic skylines on uncertain data have recently been proposed. In these approaches, a data object is composed of instances, each associated with a probability. The probabilistic skyline is then defined as the set of objects whose probability of not being dominated is greater than or equal to a given threshold. In many applications, data are generated as continuous data streams. Accordingly, in this paper we make the first attempt to study the problem of continuously returning probabilistic skylines over uncertain data streams, under the sliding window model. To avoid recomputing, for each uncertain object, the probability of not being dominated from the instances contained in the current window, our main idea is to estimate bounds on these probabilities so that we can determine early which objects can be pruned or returned as results. We first propose a basic algorithm, adapted from an existing approach to answering skyline queries on static and certain data, which updates these bounds by repeatedly processing the instances of each object. We then design a novel data structure that keeps the dominance relations between some instances in order to tighten these bounds rapidly, and propose a progressive algorithm based on this structure. These two algorithms are also adapted to solve the problem of continuously maintaining top-k probabilistic skylines. Finally, a set of experiments is performed to evaluate these algorithms, and the experimental results reveal that the progressive algorithm substantially outperforms the basic one, demonstrating the effectiveness of the newly designed structure.
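
To make the query definition concrete, the following is a minimal Python sketch of a brute-force probabilistic skyline over a single static snapshot (for example, the contents of one window). It is not the basic or the progressive algorithm of the paper, and it does no bound estimation or pruning. It rests on assumptions the abstract does not spell out: smaller values are preferred in every dimension, instances of different objects are independent, and the probability of not being dominated is computed with the formula commonly used in the probabilistic skyline literature. The names `dominates`, `skyline_probability`, and `probabilistic_skyline` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]
Instance = Tuple[Point, float]  # (attribute values, probability of this instance)


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere.

    Assumption: smaller values are better in every dimension.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_probability(name: str, objects: Dict[str, List[Instance]]) -> float:
    """Probability that object `name` is not dominated by any other object.

    Assumption: instances of different objects are independent.
    """
    total = 0.0
    for point, prob in objects[name]:
        not_dominated = 1.0
        for other, instances in objects.items():
            if other == name:
                continue
            # Probability mass of `other` that dominates this instance.
            dominating_mass = sum(q for p, q in instances if dominates(p, point))
            not_dominated *= 1.0 - dominating_mass
        total += prob * not_dominated
    return total


def probabilistic_skyline(objects: Dict[str, List[Instance]], threshold: float) -> List[str]:
    """Objects whose probability of not being dominated is >= threshold."""
    return sorted(n for n in objects if skyline_probability(n, objects) >= threshold)


# Hypothetical two-dimensional data: three uncertain objects in one window.
objects = {
    "A": [((1.0, 4.0), 0.5), ((3.0, 3.0), 0.5)],
    "B": [((2.0, 2.0), 0.5), ((5.0, 5.0), 0.5)],
    "C": [((4.0, 1.0), 1.0)],
}

print(probabilistic_skyline(objects, threshold=0.6))  # ['A', 'C']
```

In this toy data, object A has probability 0.75 of not being dominated, B has 0.5, and C has 1.0, so a threshold of 0.6 returns A and C. Recomputing these values from scratch whenever the window slides is the cost that the bound-based approach described in the abstract aims to avoid.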