Much research has been devoted to the efficient computation of relational aggregations and, specifically, the efficient execution of the datacube operation. In this paper, we consider the inverse problem, that of deriving (approximately) the original data from the aggregates. We motivate this problem in the context of two specific application areas, approximate query answering and data analysis. We propose a framework based on the notion of information entropy that enables us to estimate the original values in a data set, given only aggregated information about it. We then show how approximate queries on the data from which the aggregates were derived can be performed using our framework. We also describe an alternate use of the proposed framework that enables us to identify values that deviate from the underlying data distribution, suitable for data mining purposes. We present a detailed performance study of the algorithms using both real and synthetic data, highlighting the benefits of our approach as well as the efficiency of the proposed solutions. Finally, we evaluate our techniques with a case study on a real data set, which illustrates the applicability of our approach.
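As a rough illustration of the inverse problem described above, the sketch below reconstructs a two-dimensional table from its row and column totals alone. It relies on a standard result: when only the one-dimensional marginals of a table of non-negative counts are known, the maximum-entropy estimate of each cell is the product of its row and column totals divided by the grand total. The function names `max_entropy_estimate` and `deviations` and the sample numbers are assumptions made for this example; this is not the algorithm proposed in the paper, which handles more general sets of aggregates.

```python
import math


def max_entropy_estimate(row_sums, col_sums):
    """Estimate the cells of a 2-D table from its marginal totals.

    With only row and column totals known, the maximum-entropy
    estimate of cell (i, j) is row_sums[i] * col_sums[j] / total.
    """
    total = sum(row_sums)
    if not math.isclose(total, sum(col_sums)):
        raise ValueError("row and column totals must sum to the same value")
    if total == 0:
        # An all-zero table is the only table consistent with zero totals.
        return [[0.0 for _ in col_sums] for _ in row_sums]
    return [[r * c / total for c in col_sums] for r in row_sums]


def deviations(actual, estimate):
    """Cell-wise difference between the real table and its estimate.

    Large absolute differences mark values that depart from what the
    aggregates alone would predict.
    """
    return [
        [a - e for a, e in zip(actual_row, estimate_row)]
        for actual_row, estimate_row in zip(actual, estimate)
    ]


# Hypothetical 2x2 table, used only to exercise the functions.
actual = [[10, 20], [30, 40]]
row_sums = [sum(row) for row in actual]        # aggregates over columns
col_sums = [sum(col) for col in zip(*actual)]  # aggregates over rows

estimate = max_entropy_estimate(row_sums, col_sums)
diff = deviations(actual, estimate)
```

An approximate query over a region of the table can then be answered by summing the estimated cells in that region, and the entries of `diff` give a simple score for spotting cells that deviate from the distribution implied by the aggregates.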