A model for hierarchical memory
STOC '87 Proceedings of the nineteenth annual ACM symposium on Theory of computing
Ten lectures on wavelets
Balancing histogram optimality and practicality for query result size estimation
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Improved histograms for selectivity estimation of range predicates
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Approximation algorithms for NP-hard problems
Approximation algorithms for NP-hard problems
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Wavelet-based histograms for selectivity estimation
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
The Aqua approximate query answering system
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Synopsis data structures for massive data sets
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
Optimal histograms for hierarchical range queries (extended abstract)
PODS '00 Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
ACM Computing Surveys (CSUR)
A linear space algorithm for computing maximal common subsequences
Communications of the ACM
The working set model for program behavior
Communications of the ACM
Optimal and approximate computation of summary statistics for range aggregates
PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing
External memory algorithms and data structures: dealing with massive data
ACM Computing Surveys (CSUR)
Approximation algorithms
Fast, small-space algorithms for approximate histogram maintenance
STOC '02 Proceedings of the thiry-fourth annual ACM symposium on Theory of computing
Fast algorithms for hierarchical range histogram construction
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Wavelet synopses with error guarantees
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Locally adaptive dimensionality reduction for indexing large time series databases
ACM Transactions on Database Systems (TODS)
Access path selection in a relational database management system
SIGMOD '79 Proceedings of the 1979 ACM SIGMOD international conference on Management of data
SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Optimal Histograms with Quality Guarantees
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
What can Hierarchies do for Data Warehouses?
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Approximate Query Processing Using Wavelets
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries
Proceedings of the 27th International Conference on Very Large Data Bases
Universality of Serial Histograms
VLDB '93 Proceedings of the 19th International Conference on Very Large Data Bases
Fast Incremental Maintenance of Approximate Histograms
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Histogramming Data Streams with Fast Per-Item Processing
ICALP '02 Proceedings of the 29th International Colloquium on Automata, Languages and Programming
Extended wavelets for multiple measures
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Approximating a Data Stream for Querying and Estimation: Algorithms and Performance Evaluation
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
The optimization of queries in relational databases
The optimization of queries in relational databases
Probabilistic wavelet synopses
ACM Transactions on Database Systems (TODS)
Deterministic wavelet thresholding for maximum-error metrics
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Wavelet synopsis for data streams: minimizing non-euclidean error
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Space efficiency in synopsis construction algorithms
VLDB '05 Proceedings of the 31st international conference on Very large data bases
One-pass wavelet synopses for maximum-error metrics
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Approximation algorithms for wavelet transform coding of data streams
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
Approximation and streaming algorithms for histogram construction problems
ACM Transactions on Database Systems (TODS)
An evaluation of buffer management strategies for relational database systems
VLDB '85 Proceedings of the 11th international conference on Very Large Data Bases - Volume 11
The history of histograms (abridged)
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
XWAVE: optimal and approximate extended wavelets
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
REHIST: relative error histogram construction algorithms
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Hierarchical synopses with optimal error guarantees
ACM Transactions on Database Systems (TODS)
Multiplicative synopses for relative-error metrics
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Pricing guidance in ad sale negotiations: the PrintAds example
Proceedings of the Third International Workshop on Data Mining and Audience Intelligence for Advertising
Fast and effective histogram construction
Proceedings of the 18th ACM conference on Information and knowledge management
Optimality and scalability in lattice histogram construction
Proceedings of the VLDB Endowment
Approximating Points by a Piecewise Linear Function: I
ISAAC '09 Proceedings of the 20th International Symposium on Algorithms and Computation
Approximating Points by a Piecewise Linear Function: II. Dealing with Outliers
ISAAC '09 Proceedings of the 20th International Symposium on Algorithms and Computation
Approximating sliding windows by cyclic tree-like histograms for efficient range queries
Data & Knowledge Engineering
Information Sciences: an International Journal
Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches
Foundations and Trends in Databases
Hi-index | 0.00 |
Synopses construction algorithms have been found to be of interest in query optimization, approximate query answering and mining, and over the last few years several good synopsis construction algorithms have been proposed. These algorithms have mostly focused on the running time of the synopsis construction vis-a-vis the synopsis quality. However the space complexity of synopsis construction algorithms has not been investigated as thoroughly. Many of the optimum synopsis construction algorithms are expensive in space. For some of these algorithms the space required to construct the synopsis is significantly larger than the space required to store the input. These algorithms rely on the fact that they require a smaller "working space" and most of the data can be resident on disc. The large space complexity of synopsis construction algorithms is a handicap in several scenarios. In the case of streaming algorithms, space is a fundamental constraint. In case of offline optimal or approximate algorithms, a better space complexity often makes these algorithms much more attractive by allowing them to run in main memory and not use disc, or alternately allows us to scale to significantly larger problems without running out of space. In this paper, we propose a simple and general technique that reduces space complexity of synopsis construction algorithms. As a consequence we show that the notion of "working space" proposed in these contexts is redundant. This technique can be easily applied to many existing algorithms for synopsis construction problems. We demonstrate the performance benefits of our proposal through experiments on real-life and synthetic data. We believe that our algorithm also generalizes to a broader range of dynamic programs beyond synopsis construction.