An optimal and progressive algorithm for skyline queries

  • Authors:
  • Dimitris Papadias;Yufei Tao;Greg Fu;Bernhard Seeger

  • Affiliations:
  • Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong;Carnegie Mellon University, Pittsburgh;Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong;Philipps-University Marburg, Marburg, Germany

  • Venue:
  • Proceedings of the 2003 ACM SIGMOD international conference on Management of data
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

The skyline of a set of d-dimensional points contains the points that are not dominated by any other point on all dimensions. Skyline computation has recently received considerable attention in the database community, especially for progressive (or online) algorithms that can quickly return the first skyline points without having to read the entire data file. Currently, the most efficient algorithm is NN (nearest neighbors), which applies the divide -and-conquer framework on datasets indexed by R-trees. Although NN has some desirable features (such as high speed for returning the initial skyline points, applicability to arbitrary data distributions and dimensions), it also presents several inherent disadvantages (need for duplicate elimination if d2, multiple accesses of the same node, large space overhead). In this paper we develop BBS (branch-and-bound skyline), a progressive algorithm also based on nearest neighbor search, which is IO optimal, i.e., it performs a single access only to those R-tree nodes that may contain skyline points. Furthermore, it does not retrieve duplicates and its space overhead is significantly smaller than that of NN. Finally, BBS is simple to implement and can be efficiently applied to a variety of alternative skyline queries. An analytical and experimental comparison shows that BBS outperforms NN (usually by orders of magnitude) under all problem instances.