Effective Skyline Cardinality Estimation on Data Streams

  • Authors:
  • Yang Lu; Jiakui Zhao; Lijun Chen; Bin Cui; Dongqing Yang

  • Affiliations:
  • Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, China, School of Electronics Engineering and Computer Science, Peking University, China (shared by all five authors)

  • Venue:
  • DEXA '08: Proceedings of the 19th International Conference on Database and Expert Systems Applications
  • Year:
  • 2008

Abstract

To incorporate the skyline operator into a data stream engine, we need to address the problem of skyline cardinality estimation, which is essential for extending the query optimizer's cost model to accommodate skyline queries. In this paper, we propose robust approaches for estimating the skyline cardinality over sliding windows in a stream environment. We first design an approach to estimate the skyline cardinality over uniformly distributed data, and then extend it to support arbitrarily distributed data. Because our approaches handle arbitrary data distributions, they can be applied to extend the optimizer's cost model. To estimate the skyline cardinality in an online manner, the live elements in the sliding window are sketched using Spectral Bloom Filters, which efficiently and effectively capture the information essential for estimating the skyline cardinality over sliding windows. An extensive experimental study demonstrates that our approaches significantly outperform previous approaches.
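
As a rough illustration of the sketching step mentioned in the abstract, the following Python sketch shows how a Spectral Bloom Filter might track value frequencies for the live elements of a count-based sliding window. It is not the paper's implementation: the class names `SpectralBloomFilter` and `SlidingWindowSketch`, the salted SHA-256 hashing, the default sizes, and the use of a `deque` to know which element expires are all assumptions made for this example. Frequencies are estimated with the standard minimum-selection rule, so an estimate can exceed the true count but never falls below it.

```python
import hashlib
from collections import deque


class SpectralBloomFilter:
    """Counter-based Bloom filter; an item's frequency is estimated
    as the minimum of its k hashed counters (minimum selection)."""

    def __init__(self, num_counters=1024, num_hashes=4):
        # Sizes are illustrative defaults, not values from the paper.
        self.m = num_counters
        self.k = num_hashes
        self.counters = [0] * num_counters

    def _positions(self, item):
        # Hypothetical hashing scheme: k salted SHA-256 digests.
        data = repr(item).encode("utf-8")
        return [
            int.from_bytes(
                hashlib.sha256(bytes([i]) + data).digest()[:8], "big"
            ) % self.m
            for i in range(self.k)
        ]

    def add(self, item):
        for p in self._positions(item):
            self.counters[p] += 1

    def remove(self, item):
        # Assumes the item was added earlier (it is expiring from the window).
        for p in self._positions(item):
            self.counters[p] -= 1

    def estimate(self, item):
        return min(self.counters[p] for p in self._positions(item))


class SlidingWindowSketch:
    """Keeps the filter in sync with the last `window_size` stream values."""

    def __init__(self, window_size, **sbf_kwargs):
        self.window_size = window_size
        self.window = deque()  # assumed: the engine can tell us what expires
        self.sbf = SpectralBloomFilter(**sbf_kwargs)

    def push(self, value):
        if len(self.window) == self.window_size:
            self.sbf.remove(self.window.popleft())  # oldest element expires
        self.window.append(value)
        self.sbf.add(value)


sketch = SlidingWindowSketch(window_size=3)
for v in [5, 5, 7, 5]:
    sketch.push(v)
# The live window is now [5, 7, 5].
print(sketch.sbf.estimate(5), sketch.sbf.estimate(7))
```

In a setting like the one the abstract describes, one such sketch could be kept per dimension so that per-value frequencies of the live elements are available to a cardinality estimator. How the paper actually combines these frequencies is not stated in the abstract and is not reproduced here.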