The Skyline of a Probabilistic Relation

  • Authors: Ilaria Bartolini; Paolo Ciaccia; Marco Patella

  • Affiliations: Università di Bologna, Bologna; Università di Bologna, Bologna; Università di Bologna, Bologna

  • Venue: IEEE Transactions on Knowledge and Data Engineering
  • Year: 2013

Abstract

In a deterministic relation $R$, tuple $u$ dominates tuple $v$ if $u$ is no worse than $v$ on all the attributes of interest and better than $v$ on at least one attribute. This concept is at the heart of skyline queries, which return the set of undominated tuples in $R$. In this paper, we extend the notion of skyline to probabilistic relations by generalizing the definition of tuple domination to this context. Our approach is parametric in the semantics used for linearly ranking probabilistic tuples and, being based on order-theoretic principles, preserves the three fundamental properties the skyline has in the deterministic case: 1) it equals the union of all top-1 results of monotone scoring functions; 2) it requires no additional parameter; and 3) it is insensitive to actual attribute scales. We then show how domination among probabilistic tuples (P-domination for short) can be checked efficiently by means of a set of rules. We detail such rules for the cases in which tuples are ranked using either the “expected rank” or the “expected score” semantics, and explain how the approach can be applied to other semantics as well. Since computing the skyline of a probabilistic relation is time-consuming, we introduce a family of algorithms for checking P-domination rules in an optimized way. Experiments show that these algorithms can significantly reduce actual execution times with respect to a naïve evaluation.
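
The deterministic notion of domination defined at the start of the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm and does not cover P-domination; it is a naïve quadratic evaluation of the deterministic skyline only. The function names `dominates` and `skyline`, the sample `hotels` data, and the convention that lower attribute values are better are all assumptions made for illustration.

```python
from typing import List, Tuple

Row = Tuple[float, ...]


def dominates(u: Row, v: Row) -> bool:
    """Return True if u dominates v.

    Assumption: lower values are better on every attribute.
    u dominates v when u is no worse than v on all attributes
    and strictly better on at least one.
    """
    no_worse = all(a <= b for a, b in zip(u, v))
    strictly_better = any(a < b for a, b in zip(u, v))
    return no_worse and strictly_better


def skyline(relation: List[Row]) -> List[Row]:
    """Return the undominated tuples of a deterministic relation.

    Naive O(n^2) evaluation: a tuple is kept if no tuple dominates it.
    A tuple never dominates itself, so no self-check is needed.
    """
    return [v for v in relation if not any(dominates(u, v) for u in relation)]


# Hypothetical data: (price, distance to the beach), both to be minimized.
hotels = [(50, 3.0), (80, 1.0), (90, 2.0), (50, 4.0)]
print(skyline(hotels))  # [(50, 3.0), (80, 1.0)]
```

Here `(90, 2.0)` is dominated by `(80, 1.0)`, and `(50, 4.0)` is dominated by `(50, 3.0)`, so only two tuples remain. Neither survivor dominates the other, since each is better on a different attribute.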