Equi-depth histograms are a fundamental synopsis widely used in both database and data stream applications, where they underpin techniques such as query optimization, approximate query answering, distribution fitting, and parallel database partitioning. An equi-depth histogram partitions a sequence of data so that every bucket contains the same number of data items. In this paper, we present a new algorithm for estimating equi-depth histograms over sliding windows on high-speed data streams. While many previous methods are based on quantile computations, we propose a new method, BAr Splitting Histogram (BASH), that provides an expected ε-approximate equi-depth histogram. Extensive experiments show that BASH is at least four times faster than one of the best existing approaches, while achieving similar or better accuracy and, in some cases, using less memory. The experimental results also indicate that BASH is more stable on data affected by frequent concept shifts.
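To make the target of the approximation concrete, the following is a minimal sketch of an exact, offline equi-depth histogram over a single window. It is not BASH and does not reflect how BASH works internally; it simply sorts the window and reads off the boundaries, which is the baseline that a streaming method approximates. The function names `equi_depth_boundaries` and `bucket_counts` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from typing import List, Sequence


def equi_depth_boundaries(values: Sequence[float], num_buckets: int) -> List[float]:
    """Return num_buckets - 1 boundaries that split `values` into
    buckets holding (as nearly as possible) the same number of items.

    Exact, offline baseline: sorts the whole window. Illustrative only.
    """
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        return []
    boundaries = []
    for i in range(1, num_buckets):
        # Index of the last item that belongs to bucket i (1-based bucket id).
        idx = max((i * n) // num_buckets - 1, 0)
        boundaries.append(ordered[idx])
    return boundaries


def bucket_counts(values: Sequence[float], boundaries: Sequence[float]) -> List[int]:
    """Count the items in each bucket, where bucket k holds the values v
    with boundaries[k-1] < v <= boundaries[k]."""
    ordered = sorted(values)
    # Position of each boundary in the sorted data, plus the two ends.
    edges = [0] + [bisect_right(ordered, b) for b in boundaries] + [len(ordered)]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]


window = [5, 1, 9, 3, 7, 2, 8, 4, 6, 10, 11, 12]
bounds = equi_depth_boundaries(window, 4)   # [3, 6, 9]
counts = bucket_counts(window, bounds)      # [3, 3, 3, 3]
```

Sorting costs O(n log n) per window, which is what makes this baseline impractical to recompute on every arrival in a high-speed stream and motivates approximate methods over sliding windows.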