Probabilistic Ranking Queries on Gaussians

Authors:
Christian Bohm;Alexey Pryakhin;Matthias Schubert
Affiliations:
University of Munich, Germany;University of Munich, Germany;University of Munich, Germany
Venue:
SSDBM '06 Proceedings of the 18th International Conference on Scientific and Statistical Database Management
Year:
2006

Citing 0
Cited 7

ProUD: Probabilistic Ranking in Uncertain Databases. SSDBM '08 Proceedings of the 20th international conference on Scientific and Statistical Database Management
Probabilistic Similarity Search for Uncertain Time Series. SSDBM 2009 Proceedings of the 21st International Conference on Scientific and Statistical Database Management
Querying objects modeled by arbitrary probability distributions. SSTD'07 Proceedings of the 10th international conference on Advances in spatial and temporal databases
Similarity search and mining in uncertain databases. Proceedings of the VLDB Endowment
Continuous inverse ranking queries in uncertain streams. SSDBM'11 Proceedings of the 23rd international conference on Scientific and statistical database management
Probabilistic range monitoring of streaming uncertain positions in geosocial networks. SSDBM'12 Proceedings of the 24th international conference on Scientific and Statistical Database Management
A probabilistic approach to correlation queries in uncertain time series data. Proceedings of the 21st ACM international conference on Information and knowledge management

Abstract

In many modern applications, no exact values are available to describe the data objects. Instead, the feature values are considered uncertain, and this uncertainty is modeled by probability distributions rather than exact feature values. A typical application of such an uncertainty model is the tracking of moving objects, where the exact position of each object can be determined only at discrete points in time. Queries often involve the positions of objects between two such time stamps or after the last known time stamp. In these cases the object positions are inherently uncertain unless the pattern of movement is very simple (e.g., linear). One of the most important probability density functions for these applications is the Gaussian or normal distribution, which is defined by a mean value and a standard deviation. In this paper, we examine a new type of query on uncertain data objects, called probabilistic ranking queries (PRQ). A PRQ retrieves the k objects that have the highest probability of being located inside a given query area. To speed up probabilistic queries on large sets of uncertain data objects described by Gaussians, we introduce a novel index structure called the Gauss-tree. Furthermore, we provide an algorithm that employs the Gauss-tree to answer PRQs. In our experimental evaluation, we demonstrate that the Gauss-tree achieves a considerable efficiency advantage for PRQs compared to other applicable methods.
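
To make the PRQ definition concrete, the following is a minimal sketch of a brute-force baseline, not the Gauss-tree algorithm from the paper. It assumes that each object is described by an independent Gaussian per dimension (a mean and a standard deviation for each coordinate) and that the query area is an axis-aligned box; both are simplifying assumptions made for this illustration. The names normal_cdf, prob_in_box, and prq_linear_scan are hypothetical.

```python
import heapq
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a univariate Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def prob_in_box(means, stds, lower, upper):
    """Probability that a Gaussian object lies inside an axis-aligned box.

    Assumes independent dimensions, so the joint probability is the
    product of the per-dimension interval probabilities.
    """
    p = 1.0
    for m, s, lo, hi in zip(means, stds, lower, upper):
        p *= normal_cdf(hi, m, s) - normal_cdf(lo, m, s)
    return p


def prq_linear_scan(objects, lower, upper, k):
    """Return the k (probability, object id) pairs with the highest
    probability of lying in the query box, best first.

    This scans every object; an index such as the Gauss-tree aims to
    avoid exactly this full scan.
    """
    scored = (
        (prob_in_box(means, stds, lower, upper), oid)
        for oid, (means, stds) in objects.items()
    )
    return heapq.nlargest(k, scored)


# Hypothetical 2-D data: object id -> (means, standard deviations).
objects = {
    "a": ([0.0, 0.0], [1.0, 1.0]),
    "b": ([5.0, 5.0], [1.0, 1.0]),
    "c": ([0.5, 0.5], [3.0, 3.0]),
}
top2 = prq_linear_scan(objects, lower=[-1.0, -1.0], upper=[1.0, 1.0], k=2)
```

With this data, object "a" ranks first because its mean sits at the centre of the query box and its spread is small, while "c" ranks second despite being close to the centre, since its larger standard deviation places more probability mass outside the box.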