Probabilistic skylines on uncertain data: model and bounding-pruning-refining methods

Authors:
Bin Jiang; Jian Pei; Xuemin Lin; Yidong Yuan
Affiliations:
School of Computing Science, Simon Fraser University, Burnaby, Canada; School of Computing Science, Simon Fraser University, Burnaby, Canada; School of Computer Science and Engineering, The University of New South Wales and NICTA, Sydney, Australia; School of Computer Science and Engineering, The University of New South Wales and NICTA, Sydney, Australia
Venue:
Journal of Intelligent Information Systems
Year:
2012

Abstract

Uncertain data are inherent in some important applications. Although a considerable amount of research has been dedicated to modeling uncertain data and answering some types of queries on uncertain data, how to conduct advanced analysis on uncertain data remains largely an open problem. In this paper, we tackle the problem of skyline analysis on uncertain data. We propose a novel probabilistic skyline model where an uncertain object has a certain probability of being in the skyline, and a p-skyline contains all objects whose skyline probabilities are at least p (0 ≤ p ≤ 1). Computing probabilistic skylines on large uncertain data sets is challenging. We develop a bounding-pruning-refining framework and, within it, three algorithms. The bottom-up algorithm computes the skyline probabilities of some selected instances of uncertain objects, and uses those instances to prune other instances and uncertain objects effectively. The top-down algorithm recursively partitions the instances of uncertain objects into subsets, and prunes subsets and objects aggressively. Combining the advantages of the bottom-up and top-down algorithms, we develop a hybrid algorithm that further improves performance. Our experimental results on both a real NBA player data set and benchmark synthetic data sets show that probabilistic skylines are interesting and useful, and that our algorithms are efficient on large data sets.
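
To make the p-skyline definition concrete, the following is a minimal brute-force sketch in Python. It is not any of the three algorithms described in the abstract: it performs no bounding, pruning, or refining. It rests on assumptions the abstract does not state: each uncertain object is a finite set of equally likely instances, objects are independent of one another, and smaller values are preferred in every dimension. The function names (`dominates`, `skyline_probability`, `p_skyline`) and the toy data are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Sequence

Point = Sequence[float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every dimension and strictly better in one.

    Assumption: smaller values are preferred in all dimensions.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_probability(name: str, objects: Dict[str, List[Point]]) -> float:
    """Brute-force skyline probability of one uncertain object.

    Assumptions: instances of an object are equally likely, and objects are
    independent. An instance is in the skyline when, for every other object,
    the instance that object takes does not dominate it.
    """
    instances = objects[name]
    total = 0.0
    for u in instances:
        prob = 1.0
        for other, other_instances in objects.items():
            if other == name:
                continue
            # Fraction of the other object's instances that dominate u.
            dominating = sum(1 for v in other_instances if dominates(v, u))
            prob *= 1.0 - dominating / len(other_instances)
        total += prob
    # Average over the equally likely instances of this object.
    return total / len(instances)


def p_skyline(objects: Dict[str, List[Point]], p: float) -> List[str]:
    """Names of all objects whose skyline probability is at least p."""
    return sorted(n for n in objects if skyline_probability(n, objects) >= p)


if __name__ == "__main__":
    # Hypothetical toy data: three uncertain objects in two dimensions.
    toy = {
        "A": [(1, 1), (3, 3)],
        "B": [(2, 2), (0, 0)],
        "C": [(4, 4), (0, 5)],
    }
    for obj in toy:
        print(obj, skyline_probability(obj, toy))
    print(p_skyline(toy, 0.5))
```

This sketch compares every instance with every instance of every other object, so its cost grows quadratically with the total number of instances. That cost is the motivation for pruning instances and objects before computing exact probabilities.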