Sequential sampling procedures for query size estimation
SIGMOD '92 Proceedings of the 1992 ACM SIGMOD international conference on Management of data
Fixed-precision estimation of join selectivity
PODS '93 Proceedings of the twelfth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Balancing histogram optimality and practicality for query result size estimation
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Improved histograms for selectivity estimation of range predicates
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Wavelet-based histograms for selectivity estimation
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Data cube approximation and histograms via wavelets
Proceedings of the seventh international conference on Information and knowledge management
Self-tuning histograms: building histograms without looking at data
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Approximate computation of multidimensional aggregates of sparse data using wavelets
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Multi-dimensional selectivity estimation using compressed histogram information
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
The Aqua approximate query answering system
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Approximating multi-dimensional aggregate range queries over real attributes
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Space-efficient online computation of quantile summaries
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
STHoles: a multidimensional workload-aware histogram
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Applying the golden rule of sampling for query estimation
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing
Fast, small-space algorithms for approximate histogram maintenance
STOC '02 Proceedings of the thiry-fourth annual ACM symposium on Theory of computing
Sampling from a moving window over streaming data
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Maintaining stream statistics over sliding windows: (extended abstract)
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Continuous queries over data streams
ACM SIGMOD Record
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total
ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
On Rectangular Partitionings in Two Dimensions: Algorithms, Complexity, and Applications
ICDT '99 Proceedings of the 7th International Conference on Database Theory
Optimal Histograms with Quality Guarantees
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Dynamic Maintenance of Wavelet-Based Histograms
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries
Proceedings of the 27th International Conference on Very Large Data Bases
Sampling-Based Estimation of the Number of Distinct Values of an Attribute
VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Selectivity Estimation Without the Attribute Value Independence Assumption
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Fast Incremental Maintenance of Approximate Histograms
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
ICALP '97 Proceedings of the 24th International Colloquium on Automata, Languages and Programming
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Stable distributions, pseudorandom generators, embeddings and data stream computation
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Fjording the Stream: An Architecture for Queries Over Streaming Sensor Data
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
The optimization of queries in relational databases
The optimization of queries in relational databases
Fast, small-space algorithms for approximate histogram maintenance
STOC '02 Proceedings of the thiry-fourth annual ACM symposium on Theory of computing
Correlating XML data streams using tree-edit distance embeddings
Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Clustering Data Streams: Theory and Practice
IEEE Transactions on Knowledge and Data Engineering
Comparing Data Streams Using Hamming Norms (How to Zero In)
IEEE Transactions on Knowledge and Data Engineering
One-Pass Wavelet Decompositions of Data Streams
IEEE Transactions on Knowledge and Data Engineering
Approximate join processing over data streams
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Extended wavelets for multiple measures
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Experiments with random projections for machine learning
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
The power-method: a comprehensive estimation technique for multi-dimensional queries
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Analysis of predictive spatio-temporal queries
ACM Transactions on Database Systems (TODS)
A multi-dimensional histogram for selectivity estimation and fast approximate query answering
CASCON '03 Proceedings of the 2003 conference of the Centre for Advanced Studies on Collaborative research
A new histogram method for sparse attributes: the averaged rectangular attribute cardinality map
ISICT '03 Proceedings of the 1st international symposium on Information and communication technologies
Querying about the Past, the Present, and the Future in Spatio-Temporal Databases
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Approximate Temporal Aggregation
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Spatio-Temporal Aggregation Using Sketches
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Holistic UDAFs at streaming speeds
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Diamond in the rough: finding Hierarchical Heavy Hitters in multi-dimensional data
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Approximation techniques for spatial data
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Accuracy vs. lifetime: linear sketches for approximate aggregate range queries in sensor networks
Proceedings of the 2004 joint workshop on Foundations of mobile computing
Fast range query estimation by N-level tree histograms
Data & Knowledge Engineering
Tracking set-expression cardinalities over continuous update streams
The VLDB Journal — The International Journal on Very Large Data Bases
Semantic Approximation of Data Stream Joins
IEEE Transactions on Knowledge and Data Engineering
XML stream processing using tree-edit distance embeddings
ACM Transactions on Database Systems (TODS) - Special Issue: SIGMOD/PODS 2003
Fast and approximate stream mining of quantiles and frequencies using graphics processors
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
An improved data stream summary: the count-min sketch and its applications
Journal of Algorithms
Manifold learning visualization of network traffic data
Proceedings of the 2005 ACM SIGCOMM workshop on Mining network data
Fast window correlations over uncooperative time series
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Sketching streams through the net: distributed approximate query tracking
VLDB '05 Proceedings of the 31st international conference on Very large data bases
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Dynamic histograms for future spatiotemporal range predicates
Information Sciences—Informatics and Computer Science: An International Journal
Approximation and streaming algorithms for histogram construction problems
ACM Transactions on Database Systems (TODS)
Fast range-summable random variables for efficient aggregate estimation
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Stable distributions, pseudorandom generators, embeddings, and data stream computation
Journal of the ACM (JACM)
Journal of Intelligent Information Systems
Data mining middleware for wide-area high-performance networks
Future Generation Computer Systems - IGrid 2005: The global lambda integrated facility
Anatomy: simple and effective privacy preservation
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Online outlier detection in sensor data using non-parametric models
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
On biased reservoir sampling in the presence of stream evolution
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Data streams: algorithms and applications
Foundations and Trends® in Theoretical Computer Science
Compressed histograms with arbitrary bucket layouts for selectivity estimation
Information Sciences: an International Journal
Pseudo-random number generation for sketch-based estimations
ACM Transactions on Database Systems (TODS)
Selectivity estimation by batch-query based histogram and parametric method
ADC '07 Proceedings of the eighteenth conference on Australasian database - Volume 63
Dissemination of compressed historical information in sensor networks
The VLDB Journal — The International Journal on Very Large Data Bases
Comparing data streams using Hamming norms (how to zero in)
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
The history of histograms (abridged)
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Distributed top-N query processing with possibly uncooperative local systems
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Bloom histogram: path selectivity estimation for XML data with updates
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Rk-hist: an r-tree based histogram for multi-dimensional selectivity estimation
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Finding hierarchical heavy hitters in streaming data
ACM Transactions on Knowledge Discovery from Data (TKDD)
Proactive and reactive multi-dimensional histogram maintenance for selectivity estimation
Journal of Systems and Software
Explicit constructions for compressed sensing of sparse signals
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
Histograms based on the minimum description length principle
The VLDB Journal — The International Journal on Very Large Data Bases
Approximate continuous querying over distributed streams
ACM Transactions on Database Systems (TODS)
Collaborative data gathering in wireless sensor networks using measurement co-occurrence
Computer Communications
Hierarchical synopses with optimal error guarantees
ACM Transactions on Database Systems (TODS)
Enhancing histograms by tree-like bucket indices
The VLDB Journal — The International Journal on Very Large Data Bases
Multiple Pass Streaming Algorithms for Learning Mixtures of Distributions in ${\mathbb R}^d$
ALT '07 Proceedings of the 18th international conference on Algorithmic Learning Theory
Summarizing Two-Dimensional Data with Skyline-Based Statistical Descriptors
SSDBM '08 Proceedings of the 20th international conference on Scientific and Statistical Database Management
Breaking the Curse of Cardinality on Bitmap Indexes
SSDBM '08 Proceedings of the 20th international conference on Scientific and Statistical Database Management
Multiple-Objective Compression of Data Cubes in Cooperative OLAP Environments
ADBIS '08 Proceedings of the 12th East European conference on Advances in Databases and Information Systems
Multi-query optimization for sketch-based estimation
Information Systems
The design of a query monitoring system
ACM Transactions on Database Systems (TODS)
Multiplicative synopses for relative-error metrics
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Multiple pass streaming algorithms for learning mixtures of distributions in Rd
Theoretical Computer Science
A First Step Towards Stream Reasoning
Future Internet --- FIS 2008
Self-tuning management of update-intensive multidimensional data in clusters of workstations
The VLDB Journal — The International Journal on Very Large Data Bases
Statistical structures for Internet-scale data management
The VLDB Journal — The International Journal on Very Large Data Bases
Managing massive time series streams with multi-scale compressed trickles
Proceedings of the VLDB Endowment
Consistent histograms in the presence of distinct value counts
Proceedings of the VLDB Endowment
Computing histograms of local variables for real-time monitoring using aggregation trees
IM'09 Proceedings of the 11th IFIP/IEEE international conference on Symposium on Integrated Network Management
Dynamic histograms for future spatiotemporal range predicates
Information Sciences: an International Journal
Speeding up clustering-based k-anonymisation algorithms with pre-partitioning
BNCOD'07 Proceedings of the 24th British national conference on Databases
Journal of Intelligent Information Systems
Hierarchically organized skew-tolerant histograms for geographic data objects
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Efficient construction of histograms for multidimensional data using quad-trees
Decision Support Systems
Fast approximate wavelet tracking on streams
EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
Efficient quantile retrieval on multi-dimensional data
EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
Adaptive spatial partitioning for multidimensional data streams
ISAAC'04 Proceedings of the 15th international conference on Algorithms and Computation
Density estimation for spatial data streams
SSTD'05 Proceedings of the 9th international conference on Advances in Spatial and Temporal Databases
Streaming algorithms for geometric problems
FSTTCS'04 Proceedings of the 24th international conference on Foundations of Software Technology and Theoretical Computer Science
Approximating and testing k-histogram distributions in sub-linear time
PODS '12 Proceedings of the 31st symposium on Principles of Database Systems
Rectangle-efficient aggregation in spatial data streams
PODS '12 Proceedings of the 31st symposium on Principles of Database Systems
Improving the accuracy of histograms for geographic data objects
DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I
Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches
Foundations and Trends in Databases
Histograms as statistical estimators for aggregate queries
Information Systems
Deriving predicate statistics for logic rules
RR'12 Proceedings of the 6th international conference on Web Reasoning and Rule Systems
Sketch-based geometric monitoring of distributed stream queries
Proceedings of the VLDB Endowment
Data & Knowledge Engineering
Hi-index | 0.00 |
Histograms are a concise and flexible way to construct summary structures for large data sets. They have attracted a lot of attention in database research due to their utility in many areas, including query optimization, and approximate query answering. They are also a basic tool for data visualization and analysis.In this paper, we present a formal study of dynamic multidimensional histogram structures over continuous data streams. At the heart of our proposal is the use of a dynamic summary data structure (vastly different from a histogram) maintaining a succinct approximation of the data distribution of the underlying continuous stream. On demand, an accurate histogram is derived from this dynamic data structure. We propose algorithms for extracting such an accurate histogram and we analyze their behavior and tradeoffs. The proposed algorithms are able to provide approximate guarantees about the quality of the estimation of the histograms they extract.We complement our analytical results with a thorough experimental evaluation using real data sets.