Probabilistic skyline queries

  • Authors:
  • Christian Böhm;Frank Fiedler;Annahita Oswald;Claudia Plant;Bianca Wackersreuther

  • Affiliations:
  • University of Munich, Munich, Germany;University of Munich, Munich, Germany;University of Munich, Munich, Germany;Technische Universität München, Munich, Germany;University of Munich, Munich, Germany

  • Venue:
  • Proceedings of the 18th ACM conference on Information and knowledge management
  • Year:
  • 2009

Abstract

The ability to deal with uncertain information is becoming increasingly important for modern database applications. Whereas a conventional (certain) object is usually represented by a vector from a multidimensional feature space, an uncertain object is represented by a multivariate probability density function (PDF). This PDF can be defined either discretely (e.g., by a histogram) or continuously in parametric form (e.g., by a Gaussian mixture model). For a database of uncertain objects, users expect data analysis techniques similar to those available for a conventional database of certain objects. An important analysis technique for certain objects is the skyline operator, which finds the vectors that are maximal or minimal with respect to any possible attribute weighting. In this paper, we propose the concept of probabilistic skylines, an extension of the skyline operator to uncertain objects. In addition, we propose efficient and effective methods for determining the probabilistic skyline of uncertain objects that are defined by a PDF in parametric form (e.g., a Gaussian function or a Gaussian mixture model). To further accelerate the search, we describe how the computation of the probabilistic skyline can be supported by an index structure for uncertain objects. An extensive experimental evaluation demonstrates both the effectiveness and the efficiency of our technique.
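
To make the skyline operator mentioned in the abstract concrete, the following is a minimal sketch, not the authors' method. It assumes minimisation in every dimension, and for the uncertain case it assumes each object is a single Gaussian with independent dimensions (a diagonal covariance) and that objects are independent of each other. The function names `dominates`, `skyline`, and `dominance_probability` are hypothetical; the abstract does not state the paper's own definition of the probabilistic skyline, so the dominance probability here is only one plausible building block.

```python
import math


def dominates(a, b):
    """True if a is no worse than b in every dimension and strictly
    better in at least one (assumption: smaller values are better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points):
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def dominance_probability(mean_a, std_a, mean_b, std_b):
    """Probability that uncertain object A is smaller than B in every dimension.

    Assumption (not from the paper): A and B are independent Gaussians with
    independent dimensions, so B_i - A_i is Gaussian with mean
    mean_b[i] - mean_a[i] and variance std_a[i]**2 + std_b[i]**2, and the
    per-dimension probabilities multiply.
    """
    prob = 1.0
    for ma, sa, mb, sb in zip(mean_a, std_a, mean_b, std_b):
        prob *= normal_cdf((mb - ma) / math.sqrt(sa * sa + sb * sb))
    return prob


if __name__ == "__main__":
    certain = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
    print(skyline(certain))
    print(dominance_probability((0.0, 0.0), (1.0, 1.0), (0.0, 0.0), (1.0, 1.0)))
```

For certain objects, a point belongs to the skyline exactly when no other point dominates it. For uncertain objects, dominance becomes a probability rather than a yes/no answer, which is why an extension of the operator is needed. Handling Gaussian mixture models or correlated dimensions would require more than this sketch provides.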