Efficiently supporting ad hoc queries in large datasets of time sequences
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Browsing large digital library collections using classification hierarchies
Proceedings of the eighth international conference on Information and knowledge management
EDBT '00 Proceedings of the 7th International Conference on Extending Database Technology: Advances in Database Technology
Handbook of massive data sets
Hi-index | 0.00 |
The paradigmatic view of data in decision support consists of a set of dimensions (e.g., location, product, time period, ...), each encoding a hierarchy (e.g., location has hemisphere, country, state/province, ..., block). Typical queries consist of aggregates over a quantifiable attribute (e.g., sales) as a function of at most one attribute in each dimension of this ``data cube.'''' For example, find the sum of all sales of blue polo shirts in Palm Beach during the last quarter. In this paper, we introduce an index structure for storing and indexing aggregates, called ``cube forests,'''' to support such cube queries efficiently --- one index search is usually enough. In their most general form, cube forests require a lot of space. So, we present an optimized structure, called ``hierarchically split cube forests'''' that exploit the hierarchical nature of the data to save space. We then present a model and algorithms to arrive at designs that further reduce update time, but suffer an increase in query time. Our experiments bear out the model and show that the structure has promise for decision support applications in read-intensive environments.