Maintaining views incrementally
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Answering queries using views (extended abstract)
PODS '95 Proceedings of the fourteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Research problems in data warehousing
CIKM '95 Proceedings of the fourth international conference on Information and knowledge management
Implementing data cubes efficiently
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
An overview of data warehousing and OLAP technology
ACM SIGMOD Record
Cubetree: organization of and bulk incremental updates on the data cube
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Maintenance of data cubes and summary tables in a warehouse
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
An array-based algorithm for simultaneous multidimensional aggregates
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
On-line warehouse view maintenance
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Quasi-cubes: exploiting approximations in multidimensional databases
ACM SIGMOD Record
Data cube approximation and histograms via wavelets
Proceedings of the seventh international conference on Information and knowledge management
Rewriting aggregate queries using views
PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Bottom-up computation of sparse and Iceberg CUBE
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Compressed data cubes for OLAP aggregate query approximation on continuous dimensions
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining frequent patterns without candidate generation
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Efficient computation of Iceberg cubes with complex measures
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
On computing correlated aggregates over continual data streams
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Making views self-maintainable for data warehousing
DIS '96 Proceedings of the fourth international conference on on Parallel and distributed information systems
Models and issues in data stream systems
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Processing complex aggregate queries over data streams
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Continuous queries over data streams
ACM SIGMOD Record
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total
ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Fast Computation of Sparse Datacubes
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Answering Queries with Aggregation Using Views
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
A simple algorithm for finding frequent elements in streams and bags
ACM Transactions on Database Systems (TODS)
What's hot and what's not: tracking most frequent items dynamically
Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Optimizing Selections over Datacubes
SSDBM '00 Proceedings of the 12th International Conference on Scientific and Statistical Database Management
QC-trees: an efficient summary structure for semantic OLAP
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Condensed Cube: An Efficient Approach to Reducing Data Cube Size
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Finding recent frequent itemsets adaptively over online data streams
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Approximate counts and quantiles over sliding windows
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Multi-dimensional regression analysis of time-series data streams
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Approximate frequency counts over data streams
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Quotient cube: how to summarize the semantics of a data cube
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
A regression-based temporal pattern mining scheme for data streams
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Finding hierarchical heavy hitters in data streams
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Answering linear optimization queries with an approximate stream index
Knowledge and Information Systems
Efficient maintenance of basic statistical functions in data warehouses
Decision Support Systems
Hi-index | 0.00 |
In some business applications such as trading management in financial institutions, it is required to accurately answer ad hoc aggregate queries over data streams. Materializing and incrementally maintaining a full data cube or even its compression or approximation over a data stream is often computationally prohibitive. On the other hand, although previous studies proposed approximate methods for continuous aggregate queries, they cannot provide accurate answers. In this paper, we develop a novel prefix aggregate tree (PAT) structure for online warehousing data streams and answering ad hoc aggregate queries. Often, a data stream can be partitioned into the historical segment, which is stored in a traditional data warehouse, and the transient segment, which can be stored in a PAT to answer ad hoc aggregate queries. The size of a PAT is linear in the size of the transient segment, and only one scan of the data stream is needed to create and incrementally maintain a PAT. Although the query answering using PAT costs more than the case of a fully materialized data cube, the query answering time is still kept linear in the size of the transient segment. Our extensive experimental results on both synthetic and real data sets illustrate the efficiency and the scalability of our design.