Quasi-cubes: exploiting approximations in multidimensional databases

Authors:
Daniel Barbará;Mark Sullivan
Affiliations:
Bell Communications Research, 445 South St., Morristown, N.J.;Juno Online Services, 120 West 45th Street, 39th floor, New York, NY
Venue:
ACM SIGMOD Record
Year:
1997

Abstract

A data cube is a popular organization for summary data. A cube is simply a multidimensional structure that contains at each point an aggregate value, i.e., the result of applying an aggregate function to an underlying relation. In practical situations, cubes can require a large amount of storage. The typical approach to reducing storage cost is to materialize parts of the cube on demand. Unfortunately, this lazy evaluation can be a time-consuming operation.

In this paper, we describe an approximation technique that reduces the storage cost of the cube without incurring the run time cost of lazy evaluation. The idea is to provide an incomplete description of the cube and a method of estimating the missing entries with a certain level of accuracy. The description, of course, should take a fraction of the space of the full cube and the estimation procedure should be faster than computing the data from the underlying relations. Since cubes are used to support data analysis and analysts are rarely interested in the precise values of the aggregates (but rather in trends), providing approximate answers is, in most cases, a satisfactory compromise.

Alternatively, the technique can be used to implement a multiresolution system in which a tradeoff is established between the execution time of queries and the errors the user is willing to tolerate. By only going to the disk when it is necessary (to reduce the errors), the query can be executed faster. This idea can be extended to produce a system that incrementally increases the accuracy of the answer while the user is looking at it, supporting on-line aggregation.
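
The idea of an incomplete description plus an estimation procedure can be illustrated with a small sketch. The abstract does not say which estimation model the paper uses, so everything below is an assumption made for illustration: the toy relation `ROWS`, the function names, and in particular the choice of estimating each cell from the row and column totals (an independence model). The sketch stores the totals, keeps exactly only those cells whose estimate is off by more than a relative `tolerance`, and answers every other cell from the model.

```python
from collections import defaultdict

# Hypothetical base relation: (product, region, sales). Not from the paper.
ROWS = [
    ("pen", "east", 10), ("pen", "west", 20),
    ("ink", "east", 30), ("ink", "west", 60),
    ("pad", "east", 20), ("pad", "west", 90),
]


def build_cube(rows):
    """Full 2-D cube: SUM(sales) for every (product, region) cell."""
    cube = defaultdict(float)
    for product, region, sales in rows:
        cube[(product, region)] += sales
    return dict(cube)


def build_quasi_cube(cube, tolerance):
    """Incomplete description of the cube (assumed model, see lead-in).

    Stores the per-product totals, per-region totals and grand total,
    plus the exact value of each cell whose model estimate has a
    relative error above `tolerance`.
    """
    row_tot, col_tot = defaultdict(float), defaultdict(float)
    for (product, region), value in cube.items():
        row_tot[product] += value
        col_tot[region] += value
    total = sum(cube.values())

    retained = {}
    for (product, region), value in cube.items():
        est = row_tot[product] * col_tot[region] / total
        if abs(est - value) > tolerance * abs(value):
            retained[(product, region)] = value  # model too far off: keep exact
    return dict(row_tot), dict(col_tot), total, retained


def estimate(quasi, cell):
    """Answer a cell query from the quasi-cube without the base relation."""
    row_tot, col_tot, total, retained = quasi
    if cell in retained:
        return retained[cell]
    product, region = cell
    return row_tot[product] * col_tot[region] / total


cube = build_cube(ROWS)
quasi = build_quasi_cube(cube, tolerance=0.25)
```

With `tolerance=0.25`, only the `("pad", "east")` cell is retained exactly; the other five are answered from the totals within the requested error. On a cube this small the stored totals cost as much as the cells themselves, so the sketch shows the mechanism rather than any space saving. The `tolerance` parameter plays the role of the error the user is willing to accept: lowering it retains more exact cells and brings answers closer to the full cube.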