The data cube has played an essential role in fast OLAP (online analytical processing) in many multi-dimensional data warehouses. However, there exist data sets in applications like bioinformatics, statistics, and text processing that are characterized by high dimensionality, e.g., over 100 dimensions, and moderate size, e.g., around 10^6 tuples. No feasible data cube can be constructed for such data sets. In this paper, we address the problem of developing an efficient algorithm to perform OLAP on such data sets. Experience tells us that although data analysis tasks may involve a high-dimensional space, most OLAP operations are performed on only a small number of dimensions at a time. Based on this observation, we propose a novel method that computes a thin layer of the data cube together with associated value-list indices. This layer, while manageable in size, is capable of supporting flexible and fast OLAP operations in the original high-dimensional space. Through experiments, we show that the method's I/O costs scale well with dimensionality. Furthermore, the costs are comparable to those of accessing an existing data cube when full materialization is possible.
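The idea of a thin cube layer with value-list indices can be illustrated with a small sketch. The abstract does not give the construction, so everything below is an assumption made for illustration: dimensions are grouped into fixed-size fragments, every cell inside a fragment stores the list of matching row ids, and a query that spans fragments is answered by intersecting those lists. The names `build_fragment_index`, `count_query`, and `fragment_size` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def build_fragment_index(rows, fragment_size=2):
    """Index every cell of every small fragment of dimensions.

    rows is a list of equal-length tuples, one value per dimension.
    The result maps a cell, written as a tuple of (dimension, value)
    pairs in increasing dimension order, to the set of row ids in it.
    Assumption: fragments are consecutive runs of dimensions.
    """
    n_dims = len(rows[0])
    index = defaultdict(set)
    for start in range(0, n_dims, fragment_size):
        fragment = range(start, min(start + fragment_size, n_dims))
        # Each non-empty subset of the fragment's dimensions is one cuboid.
        for k in range(1, len(fragment) + 1):
            for dims in combinations(fragment, k):
                for tid, row in enumerate(rows):
                    cell = tuple((d, row[d]) for d in dims)
                    index[cell].add(tid)
    return index


def count_query(index, conditions, fragment_size=2):
    """Count rows matching a non-empty {dimension: value} selection."""
    if not conditions:
        raise ValueError("at least one condition is required")
    # Group the conditions by the fragment that owns each dimension.
    by_fragment = defaultdict(list)
    for d, v in sorted(conditions.items()):
        by_fragment[d // fragment_size].append((d, v))
    result = None
    for cell in by_fragment.values():
        # One lookup per fragment; an unseen cell means no matching rows.
        tids = index.get(tuple(cell), set())
        result = tids if result is None else result & tids
    return len(result)
```

With fragments of size 2, a selection on two dimensions from the same fragment is a single lookup, while a selection across fragments costs one lookup per fragment plus a set intersection. Under this assumed layout, the stored layer grows with the number of fragments rather than with the number of dimension subsets of the full space.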