We study the problem of computing wavelet-based synopses for massive data sets in static and streaming environments. A compact representation of a data set is obtained by thresholding the coefficients of its wavelet decomposition. Existing polynomial-time thresholding schemes that minimize maximum-error metrics have impractical time and space complexities and are not applicable in a data stream context. This is a critical limitation, because the problem in its most practically interesting form involves the time-efficient approximation of huge amounts of data, potentially in a streaming environment. In this paper we fill this gap by developing efficient and practical wavelet thresholding algorithms for maximum-error metrics in both the static and the streaming case. Our experiments show that these algorithms achieve near-optimal accuracy and superior runtime performance under frugal space requirements in both contexts.
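To make the setting concrete, the following is a minimal sketch of what a thresholded wavelet synopsis is and how its maximum error is measured. It is not the algorithm developed in this paper: it assumes the Haar wavelet, an input whose length is a power of two, and the conventional baseline that keeps the B coefficients of largest normalized magnitude, which targets the overall squared error rather than the maximum error. All function names are hypothetical and chosen for illustration only.

```python
import math


def haar_decompose(data):
    """Haar decomposition by pairwise averaging and differencing.

    Returns [overall average, detail coefficients from coarsest to finest].
    Assumes len(data) is a power of two.
    """
    n = len(data)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    current = [float(x) for x in data]
    coeffs = []
    while len(current) > 1:
        pairs = range(0, len(current), 2)
        averages = [(current[i] + current[i + 1]) / 2 for i in pairs]
        details = [(current[i] - current[i + 1]) / 2 for i in pairs]
        coeffs = details + coeffs  # finer levels end up at the back
        current = averages
    return current + coeffs


def haar_reconstruct(coeffs):
    """Invert haar_decompose; a dropped coefficient is represented by 0."""
    current = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(current)]
        pos += len(current)
        current = [v for a, d in zip(current, details) for v in (a + d, a - d)]
    return current


def threshold_largest(coeffs, budget):
    """Baseline (assumed here, not the paper's method): keep the `budget`
    coefficients with the largest normalized magnitude, zero the rest."""
    n = len(coeffs)

    def weight(i):
        # Coefficient i >= 1 sits at level floor(log2(i)) and covers
        # n / 2**level data values; the overall average covers all n.
        support = n if i == 0 else n >> (i.bit_length() - 1)
        return abs(coeffs[i]) * math.sqrt(support)

    keep = set(sorted(range(n), key=weight, reverse=True)[:budget])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


def max_abs_error(data, approx):
    """Maximum absolute error between the data and its approximation."""
    return max(abs(x - y) for x, y in zip(data, approx))


data = [2, 2, 0, 2, 3, 5, 4, 4]
coeffs = haar_decompose(data)
synopsis = threshold_largest(coeffs, budget=2)
error = max_abs_error(data, haar_reconstruct(synopsis))
```

On this eight-value input, keeping two coefficients leaves a maximum absolute error of 1.5. The baseline places no bound on that quantity when it selects coefficients, which is the gap that thresholding schemes designed for maximum-error metrics are meant to close.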