On the self-similar nature of Ethernet traffic (extended version)
IEEE/ACM Transactions on Networking (TON)
The space complexity of approximating the frequency moments
STOC '96 Proceedings of the twenty-eighth annual ACM symposium on Theory of computing
Wavelet-based histograms for selectivity estimation
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Join synopses for approximate query answering
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Optimal histograms for hierarchical range queries (extended abstract)
PODS '00 Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Models and issues in data stream systems
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Sampling from a moving window over streaming data
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Maintaining stream statistics over sliding windows: (extended abstract)
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Processing complex aggregate queries over data streams
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Querying and mining data streams: you only get one look a tutorial
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Access path selection in a relational database management system
SIGMOD '79 Proceedings of the 1979 ACM SIGMOD international conference on Management of data
Continuous queries over data streams
ACM SIGMOD Record
Optimal Histograms with Quality Guarantees
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries
Proceedings of the 27th International Conference on Very Large Data Bases
Finding Frequent Items in Data Streams
ICALP '02 Proceedings of the 29th International Colloquium on Automata, Languages and Programming
An Approximate L1-Difference Algorithm for Massive Data Streams
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
The optimization of queries in relational databases
The optimization of queries in relational databases
Mercury: supporting scalable multi-attribute range queries
Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications
Fast range query estimation by N-level tree histograms
Data & Knowledge Engineering
Reducing data stream sliding windows by cyclic tree-like histograms
PKDD '04 Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases
A Unified Framework for Monitoring Data Streams in Real Time
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Stabbing the Sky: Efficient Skyline Computation over Sliding Windows
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
An improved data stream summary: the count-min sketch and its applications
Journal of Algorithms
Sketching streams through the net: distributed approximate query tracking
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Summarizing and mining inverse distributions on data streams via dynamic inverse sampling
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Space efficiency in synopsis construction algorithms
VLDB '05 Proceedings of the 31st international conference on Very large data bases
ACM SIGMOD Record
The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features
ICCV '05 Proceedings of the Tenth IEEE International Conference on Computer Vision - Volume 2
Approximation algorithms for wavelet transform coding of data streams
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
Approximation and streaming algorithms for histogram construction problems
ACM Transactions on Database Systems (TODS)
On biased reservoir sampling in the presence of stream evolution
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Histogram-by: A grouping operator for continuous domains
Data & Knowledge Engineering
Deterministic sampling and range counting in geometric data streams
ACM Transactions on Algorithms (TALG)
Processing partially specified queries over high-dimensional databases
Data & Knowledge Engineering
Continuous Nearest Neighbor Queries over Sliding Windows
IEEE Transactions on Knowledge and Data Engineering
Approximate frequency counts over data streams
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Selectivity estimation of range queries based on data density approximation via cosine series
Data & Knowledge Engineering
XWAVE: optimal and approximate extended wavelets
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
REHIST: relative error histogram construction algorithms
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Deterministic algorithms for sampling count data
Data & Knowledge Engineering
Incremental Maintenance of Online Summaries Over Multiple Streams
IEEE Transactions on Knowledge and Data Engineering
Sampling time-based sliding windows in bounded space
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Time-decaying aggregates in out-of-order streams
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Enhancing histograms by tree-like bucket indices
The VLDB Journal — The International Journal on Very Large Data Bases
The VLDB Journal — The International Journal on Very Large Data Bases
Multi-query optimization for sketch-based estimation
Information Systems
Mining non-derivable frequent itemsets over data stream
Data & Knowledge Engineering
Frequency Estimation over Sliding Windows
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Histograms and Wavelets on Probabilistic Data
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Optimal sampling from sliding windows
Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Optimality and scalability in lattice histogram construction
Proceedings of the VLDB Endowment
Maintaining time-decaying stream aggregates
Journal of Algorithms
A deterministic algorithm for summarizing asynchronous streams over a sliding window
STACS'07 Proceedings of the 24th annual conference on Theoretical aspects of computer science
Aggregate computation over data streams
APWeb'08 Proceedings of the 10th Asia-Pacific web conference on Progress in WWW research and development
Fast approximate wavelet tracking on streams
EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
On the testing for alpha-stable distributions of network traffic
Computer Communications
Local area network characteristics, with implications for broadband network congestion management
IEEE Journal on Selected Areas in Communications
Adaptive optimization for multiple continuous queries
Data & Knowledge Engineering
Identifying streaming frequent items in ad hoc time windows
Data & Knowledge Engineering
A hierarchical semantic-based distance for nominal histogram comparison
Data & Knowledge Engineering
Hi-index | 0.00 |
The issue of providing fast approximate answers to range queries on sliding windows with a small consumption of storage space is one of the main challenges in the context of data streams. On the one hand, the importance of this class of queries is widely accepted. They are indeed useful to compute aggregate information over the data stream, allowing us to extract from it more abstract knowledge than point queries. On the other hand, the usage of techniques like synopses based on histograms, sketches, sampling, and so on, makes effective those approaches which require multiple scans on data, which otherwise would be prohibitive from the computational point of view. Among the above techniques, histogram-based approaches are considered one of the most advantageous solutions, at least in case of range queries. It is a matter of fact that histograms show a very good capability of summarizing data preserving quick and accurate answers to range queries. In this paper, we propose a novel histogram-based technique to reduce sliding windows supporting approximate arbitrary range-sum queries. Our histogram, relying on a tree-based structure, is suitable to directly support hierarchical queries and, thus, drill-down and roll-up operations. In addition, the structure well supports sliding window shifting and quick query answering, since it operates in logarithmic time in the sliding window size. A bit-saving approach to encoding tree nodes allows us to compress the sliding window with a little price in terms of accuracy. The contribution of this work is thus not only the proposal of a new specific technique to tackle an important problem but also a deep analysis of the advantages given by the hierarchical approach combined with the bit-saving strategy. A careful experimental analysis validates the method showing its superiority w.r.t. the state of the art.