A Sequence OLAP (S-OLAP) system provides a platform on which pattern-based aggregate (PBA) queries on a sequence database are evaluated. In its simplest form, a PBA query consists of a pattern template T and an aggregate function F. A pattern template is a sequence of variables, each defined over a domain. For example, the template T = (X, Y, Y, X) consists of two variables, X and Y. Each variable is instantiated with every value in its domain to derive all possible patterns of the template. Sequences are grouped by the patterns they possess. The answer to a PBA query is a sequence cuboid (s-cuboid), which is a multidimensional array of cells. Each cell is associated with a pattern instantiated from the query's pattern template. The value of each s-cuboid cell is obtained by applying the aggregate function F to the set of data sequences that belong to that cell. Since a pattern template can involve many variables and can be arbitrarily long, the s-cuboid induced by a PBA query can be huge. For most analytical tasks, however, only iceberg cells, those with very large aggregate values, are of interest. This paper proposes an efficient approach to identify and evaluate the iceberg cells of s-cuboids. Experimental results show that our algorithms are orders of magnitude faster than existing approaches.
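To make the definitions above concrete, the following is a minimal brute-force sketch of a PBA query with the template T = (X, Y, Y, X) and F = COUNT. It is not the approach the paper proposes; it only illustrates what an s-cuboid and its iceberg cells are. Several things here are assumptions for illustration: a sequence is taken to "possess" a pattern when the pattern occurs as a contiguous run, X and Y are allowed to take the same value, and the names `s_cuboid`, `iceberg_cells`, and `min_value`, as well as the toy data, are hypothetical.

```python
from itertools import product

def instantiate(template, domains):
    """Yield (cell, pattern) for every assignment of domain values to the variables."""
    variables = list(dict.fromkeys(template))  # distinct variables, first-seen order
    for values in product(*(domains[v] for v in variables)):
        binding = dict(zip(variables, values))
        yield tuple(values), tuple(binding[v] for v in template)

def contains(sequence, pattern):
    """Assumption: a sequence possesses a pattern if it occurs as a contiguous run."""
    n = len(pattern)
    return any(tuple(sequence[i:i + n]) == pattern
               for i in range(len(sequence) - n + 1))

def s_cuboid(sequences, template, domains, aggregate=len):
    """Map each cell (one value per variable) to F applied to its matching sequences."""
    cells = {}
    for cell, pattern in instantiate(template, domains):
        group = [s for s in sequences if contains(s, pattern)]
        cells[cell] = aggregate(group)  # default F is COUNT
    return cells

def iceberg_cells(cuboid, min_value):
    """Keep only cells whose aggregate value reaches the (hypothetical) threshold."""
    return {cell: value for cell, value in cuboid.items() if value >= min_value}

# Toy sequence database (made up for illustration).
sequences = [
    ["a", "b", "b", "a"],
    ["c", "a", "b", "b", "a"],
    ["b", "a", "a", "b"],
    ["a", "a", "a", "a"],
]
domains = {"X": ["a", "b"], "Y": ["a", "b"]}

cuboid = s_cuboid(sequences, ("X", "Y", "Y", "X"), domains)
iceberg = iceberg_cells(cuboid, min_value=2)
print(cuboid)   # one cell per (X, Y) instantiation
print(iceberg)  # only the cells with a large enough count
```

Even in this toy setting the cuboid has one cell per combination of variable values, so its size grows multiplicatively with the number of variables and the size of their domains. Enumerating every cell and scanning every sequence, as this sketch does, is exactly the cost that motivates evaluating only the iceberg cells.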