XWAVE: optimal and approximate extended wavelets

Authors:
Sudipto Guha;Chulyun Kim;Kyuseok Shim
Affiliations:
University of Pennsylvania;Seoul National University;Seoul National University
Venue:
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Year:
2004

References: 17
Cited by: 21

Improved histograms for selectivity estimation of range predicates

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Online aggregation

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
New sampling-based summary statistics for improving approximate query answers

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Wavelet-based histograms for selectivity estimation

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Wavelets for computer graphics: theory and applications

Wavelets for computer graphics: theory and applications
Data cube approximation and histograms via wavelets

Proceedings of the seventh international conference on Information and knowledge management
Approximate computation of multidimensional aggregates of sparse data using wavelets

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
WALRUS: a similarity retrieval algorithm for image databases

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Applying the golden rule of sampling for query estimation

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Histogram-Based Approximation of Set-Valued Query-Answers

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Approximate Query Processing Using Wavelets

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Dynamic Maintenance of Wavelet-Based Histograms

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries

Proceedings of the 27th International Conference on Very Large Data Bases
Selectivity Estimation Without the Attribute Value Independence Assumption

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Histogramming Data Streams with Fast Per-Item Processing

ICALP '02 Proceedings of the 29th International Colloquium on Automata, Languages and Programming
Fast Approximate Answers to Aggregate Queries on a Data Cube

SSDBM '99 Proceedings of the 11th International Conference on Scientific and Statistical Database Management
Extended wavelets for multiple measures

Proceedings of the 2003 ACM SIGMOD international conference on Management of data

RPJ: producing fast join results on streams through rate-based optimization

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
BRAID: stream mining through group lag correlations

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Wavelet synopsis for data streams: minimizing non-euclidean error

Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
One-pass wavelet synopses for maximum-error metrics

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Streaming pattern discovery in multiple time-series

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Random Sampling for Continuous Streams with Arbitrary Updates

IEEE Transactions on Knowledge and Data Engineering
Extended wavelets for multiple measures

ACM Transactions on Database Systems (TODS)
Efficient Process of Top-k Range-Sum Queries over Multiple Streams with Minimized Global Error

IEEE Transactions on Knowledge and Data Engineering
Boolean representation based data-adaptive correlation analysis over time series streams

Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
On the space-time of optimal, approximate and streaming algorithms for synopsis construction problems

The VLDB Journal — The International Journal on Very Large Data Bases
Adaptive correlation analysis in stream time series with sliding windows

Computers & Mathematics with Applications
Multiplicative synopses for relative-error metrics

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Hierarchically compressed wavelet synopses

The VLDB Journal — The International Journal on Very Large Data Bases
AMID: Approximation of MultI-measured Data using SVD

Information Sciences: an International Journal
Measuring evolving data streams' behavior through their intrinsic dimension

New Generation Computing
Approximating sliding windows by cyclic tree-like histograms for efficient range queries

Data & Knowledge Engineering
Beyond simple aggregates: indexing for summary queries

Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Mining correlations between multi-streams based on Haar wavelet

ASIAN'05 Proceedings of the 10th Asian Computing Science conference on Advances in computer science: data management on the web
Multivariate stream data reduction in sensor network applications

EUC'05 Proceedings of the 2005 international conference on Embedded and Ubiquitous Computing
Density estimation for spatial data streams

SSTD'05 Proceedings of the 9th international conference on Advances in Spatial and Temporal Databases
Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches

Foundations and Trends in Databases

Abstract

Wavelet synopses are of interest in query optimization and approximate query answering. Recently, Deligiannakis and Roussopoulos proposed extended wavelets for data sets containing multiple measures. Extended wavelets optimize storage utilization by attempting to store the same wavelet coefficient across different measures, which reduces the bookkeeping overhead so that more coefficients can be stored. They provided an optimal algorithm for minimizing the representation error and an approximation algorithm for the complementary problem. However, both of their algorithms take linear space. Synopsis structures are often used in environments where space is at a premium and the data arrives as a continuous stream that is too expensive to store. In this paper, we give algorithms for extended wavelets that are space sensitive, i.e., they use space that depends on the size of the synopsis (and at most on the logarithm of the total data size), and that operate in a streaming fashion. We present better optimal algorithms based on dynamic programming and a near-optimal approximate greedy algorithm. We also demonstrate the performance benefits of our algorithms over previous ones through experiments on real-life and synthetic data sets.
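
The storage idea behind extended wavelets can be illustrated with a small sketch. This is not the dynamic program or the greedy algorithm of the paper, and it is not streaming. It only shows, under assumed unit costs, why storing the same coefficient position for several measures saves space: the position header is paid once and shared. The names `haar`, `synopsis_cost` and `greedy_select`, the parameters `header_cost` and `value_cost`, and the benefit-per-cost heuristic are all hypothetical choices made for illustration.

```python
import math


def haar(values):
    """Orthonormal Haar transform; len(values) must be a power of two."""
    coeffs = list(values)
    details = []
    n = len(coeffs)
    while n > 1:
        half = n // 2
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        det = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        details = det + details  # coarser details go in front
        coeffs = avg
        n = half
    return coeffs + details


def synopsis_cost(selection, header_cost=1.0, value_cost=1.0):
    """Space of a set of (position, measure) pairs.

    Assumed cost model: one header per distinct position, shared by all
    measures stored at that position, plus one value per stored pair.
    """
    positions = {p for p, _ in selection}
    return header_cost * len(positions) + value_cost * len(selection)


def greedy_select(coeffs, budget, header_cost=1.0, value_cost=1.0):
    """Naive heuristic (an assumption, not the paper's method).

    coeffs[m][p] is coefficient p of measure m. Repeatedly take the pair
    with the largest squared value per unit of marginal space, skipping
    pairs that no longer fit in the budget.
    """
    candidates = [(p, m) for m, row in enumerate(coeffs) for p in range(len(row))]
    chosen, used_positions, spent = set(), set(), 0.0

    def marginal_cost(pair):
        # The header is free if this position is already in the synopsis.
        return value_cost + (0.0 if pair[0] in used_positions else header_cost)

    while candidates:
        best = max(candidates, key=lambda c: coeffs[c[1]][c[0]] ** 2 / marginal_cost(c))
        candidates.remove(best)
        cost = marginal_cost(best)
        if spent + cost > budget:
            continue
        chosen.add(best)
        used_positions.add(best[0])
        spent += cost
    return chosen
```

For example, with two measures `[1, 2, 3, 4]` and `[2, 4, 6, 8]` and a budget of 3 units, the sketch stores the overall-average coefficient for both measures under one shared header rather than spending a second header on a different position.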