Histograms and wavelet synopses are useful tools in query optimization and approximate query answering. Traditional histogram construction algorithms, such as V-Optimal, optimize absolute error measures, under which estimating a true value of 10 as 20 incurs the same penalty as estimating a true value of 1000 as 1010. However, several researchers have recently pointed out the drawbacks of such schemes and proposed wavelet-based schemes to minimize relative error measures. None of these schemes provides satisfactory guarantees, and we provide evidence that the difficulty may lie in the choice of wavelets as the representation scheme. In this paper, we consider histogram construction for the known relative error measures. We develop optimal as well as fast approximation algorithms. We provide a comprehensive theoretical analysis and demonstrate, on synthetic and real-life data sets, that these algorithms provide significantly more accurate answers.
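
The contrast between absolute and relative error in the 10-versus-20 and 1000-versus-1010 example can be made concrete with a small sketch. It is illustrative only and is not taken from the paper: the function names are hypothetical, and the `sanity_bound` parameter (a floor on the denominator so that very small true values do not dominate) is an assumption about how a relative error measure might be defined.

```python
def absolute_error(true_value, estimate):
    """Absolute error: the raw distance between the estimate and the truth."""
    return abs(true_value - estimate)


def relative_error(true_value, estimate, sanity_bound=1.0):
    """Relative error: absolute error scaled by the size of the true value.

    sanity_bound is an assumed floor on the denominator; it keeps the
    ratio finite when the true value is zero or very small.
    """
    return abs(true_value - estimate) / max(abs(true_value), sanity_bound)


# The two cases from the abstract: (true value, estimate).
cases = [(10, 20), (1000, 1010)]

for true_value, estimate in cases:
    print(
        f"true={true_value:5d} estimate={estimate:5d} "
        f"absolute={absolute_error(true_value, estimate):6.2f} "
        f"relative={relative_error(true_value, estimate):6.2f}"
    )
```

Both cases have an absolute error of 10, so an absolute error measure treats them as equally bad. Under the relative measure sketched here, the first estimate is off by 100% of the true value and the second by only 1%.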