Implementing data cubes efficiently
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Materialized view maintenance and integrity constraint checking: trading space for time
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
The space complexity of approximating the frequency moments
STOC '96 Proceedings of the twenty-eighth annual ACM symposium on Theory of computing
View indexing in relational databases
ACM Transactions on Database Systems (TODS)
Decomposition—a strategy for query processing
ACM Transactions on Database Systems (TODS)
NiagaraCQ: a scalable continuous query system for Internet databases
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
On computing correlated aggregates over continual data streams
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
A robust, optimization-based approach for approximate answering of aggregate queries
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Models and issues in data stream systems
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Continuously adaptive continuous queries over streams
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Processing complex aggregate queries over data streams
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Common expression analysis in database applications
SIGMOD '82 Proceedings of the 1982 ACM SIGMOD international conference on Management of data
Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries
Proceedings of the 27th International Conference on Very Large Data Bases
Gigascope: a stream database for network applications
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Path sharing and predicate evaluation for high-performance XML filtering
ACM Transactions on Database Systems (TODS)
Nile: A Query Processing Engine for Data Streams
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Multiple aggregations over data streams
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
On-the-fly sharing for streamed aggregation
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Tribeca: a system for managing large databases of network traffic
ATEC '98 Proceedings of the annual conference on USENIX Annual Technical Conference
Streaming queries over streaming data
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Monitoring streams: a new class of data management applications
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Data stream query processing: a tutorial
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Resource sharing in continuous sliding-window aggregates
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Rule-based multi-query optimization
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Optimization of single expressions in a relational data base system
IBM Journal of Research and Development
Towards expressive publish/subscribe systems
EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
Regression on evolving multi-relational data streams
Proceedings of the 2011 Joint EDBT/ICDT Ph.D. Workshop
Optimized processing of multiple aggregate continuous queries
Proceedings of the 20th ACM international conference on Information and knowledge management
Hi-index | 0.00 |
Data streams characterize the high speed and large volume input of a new class of applications such as network monitoring, web content analysis and sensor networks. Among these applications, network monitoring may be the most compelling one--the backbone of a large internet service provider can generate 1 petabyte of data per day. For many network monitoring tasks such as traffic analysis and statistics collection, aggregation is a primitive operation. Various analytical and statistical needs naturally lead to related aggregate queries. In this article, we address the problem of efficiently computing multiple aggregations over high-speed data streams based on the two-level query processing architecture of GS, a real data stream management system deployed in AT & T. We discern that additionally computing and maintaining fine-granularity aggregations (called phantoms) has the benefit of supporting shared computation. Based on a thorough analysis, we propose algorithms to identify the best set of phantoms to maintain and determine allocation of resources (particularly, space) to compute the aggregations. Experiments show that our algorithm achieves near-optimal computation costs, which outperforms the best adapted algorithm by more than an order of magnitude.