Aggregate Skyline: Analysis for Online Users

  • Authors:
  • Shyam Antony;Ping Wu;Divyakant Agrawal;Amr El Abbadi

  • Venue:
  • SAINT '09 Proceedings of the 2009 Ninth Annual International Symposium on Applications and the Internet
  • Year:
  • 2009

Abstract

Aggregation is among the core functionalities of OLAP systems. Frequently, such queries are issued in decision support systems to identify interesting groups of data. In conventional settings, the queries take a long time to compute (hours) and produce massive result sets at varying degrees of aggregation. Providing real-time analysis results to web users can enhance the utility of sites dealing with large amounts of data. However, doing so requires succinct ways of capturing interesting analysis results rather than complex offline analysis. The result set should be presentable in a few web pages. Furthermore, such results should be computed quickly and updated in the background whenever possible. We propose skyline queries over aggregated data as a means of providing succinct but interesting analysis results. We support aggregation functions from a large class of monotone functions that can be specified at runtime, thereby allowing user customization of the analysis. We explore a family of algorithms that try to consume only as many data records as are necessary to compute the skyline, and we identify an optimal algorithm within the family. We further refine the algorithm by taking into account system issues such as disk behavior, which are often ignored but have a strong impact on real system performance. Experimental results provide strong validation for the performance and progressive nature of the algorithm.
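
To make the idea of a skyline over aggregated data concrete, the following is a minimal sketch of a naive baseline, not the algorithm proposed in the paper. It assumes records of the form (group key, tuple of measures), aggregates every dimension per group with a function chosen at runtime, and keeps the groups that no other group dominates (larger is assumed to be better in every dimension). The names `aggregate`, `dominates`, `aggregate_skyline`, and the sample data are all hypothetical. Unlike the progressive algorithms described above, this version reads every record before producing any result.

```python
from collections import defaultdict


def aggregate(records, agg=sum):
    """Group (key, measures) records and aggregate each dimension with `agg`."""
    groups = defaultdict(list)
    for key, measures in records:
        groups[key].append(measures)
    # zip(*rows) turns the rows of one group into per-dimension columns.
    return {
        key: tuple(agg(column) for column in zip(*rows))
        for key, rows in groups.items()
    }


def dominates(a, b):
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def aggregate_skyline(records, agg=sum):
    """Return the groups whose aggregated vector is not dominated by any other group."""
    totals = aggregate(records, agg)
    return {
        key: vector
        for key, vector in totals.items()
        if not any(dominates(other, vector) for other in totals.values())
    }


# Hypothetical sample data: (group, (measure_1, measure_2)).
records = [
    ("A", (3, 1)),
    ("A", (2, 2)),
    ("B", (1, 6)),
    ("C", (2, 2)),
    ("C", (1, 0)),
]

result_sum = aggregate_skyline(records)           # aggregate with SUM
result_max = aggregate_skyline(records, agg=max)  # aggregate with MAX, chosen at runtime
```

With SUM, the groups aggregate to A = (5, 3), B = (1, 6), and C = (3, 2); C is dominated by A, so only A and B remain in the skyline. Passing a different monotone function such as `max` changes the aggregates without changing the skyline logic, which is the kind of runtime customization the abstract refers to.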