Incremental aggregation on multiple continuous queries

Authors:
Chun Jin;Jaime Carbonell
Affiliations:
Language Technologies Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA;Language Technologies Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA
Venue:
ISMIS'06 Proceedings of the 16th international conference on Foundations of Intelligent Systems
Year:
2006

Abstract

Continuously monitoring large-scale aggregates over data streams is important for many stream processing applications, e.g., collaborative intelligence analysis, and presents new challenges to data management systems. The first challenge is to efficiently compute the updated aggregate values and deliver the new results to users as new tuples arrive. We implemented an incremental aggregation mechanism that does so for arbitrary algebraic aggregate functions, including user-defined ones, by maintaining up-to-date finite data summaries. The second challenge is to construct shared query evaluation plans that support queries at large scale effectively. Since multiple-query optimization is NP-complete and queries generally arrive asynchronously, we apply an incremental sharing approach to obtain shared plans that perform reasonably well. The system is built as part of ARGUS, a stream processing system that runs atop a DBMS. The evaluation study shows that our approaches are effective and efficient on typical collaborative intelligence analysis data and queries.
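
As an illustration of the idea of keeping a finite summary for an algebraic aggregate, the following minimal Python sketch maintains a per-group AVG from a running sum and count, so each new tuple updates the result without rescanning earlier tuples. It is not the ARGUS implementation, which the abstract describes as running atop a DBMS; the names AvgSummary, IncrementalAvg, and insert are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AvgSummary:
    """Finite summary for AVG: a running sum and a running count (hypothetical name)."""
    total: float = 0.0
    count: int = 0

    def update(self, value: float) -> None:
        # Fold one new value into the summary in constant time.
        self.total += value
        self.count += 1

    def result(self) -> float:
        # AVG is algebraic: it is derived from the two distributive parts.
        return self.total / self.count


class IncrementalAvg:
    """Maintains one AvgSummary per group key (illustrative only)."""

    def __init__(self) -> None:
        self.groups = defaultdict(AvgSummary)

    def insert(self, key, value: float) -> float:
        # Update only the affected group and return its refreshed aggregate.
        summary = self.groups[key]
        summary.update(value)
        return summary.result()


if __name__ == "__main__":
    agg = IncrementalAvg()
    for key, value in [("a", 10.0), ("a", 20.0), ("b", 4.0)]:
        print(key, agg.insert(key, value))
```

The same pattern would extend to other algebraic aggregates by swapping in a different fixed-size summary, under the assumption that the aggregate can be derived from a bounded number of incrementally maintainable values.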