Several recent papers have focused on OLAP over imprecise data, where each fact can be a region, instead of a point, in a multi-dimensional space. They have provided a multiple-world semantics for such data, and developed efficient ways to answer OLAP aggregation queries over the imprecise facts. These solutions, however, assume that the imprecise facts can be interpreted independently of one another, a key assumption that is often violated in practice. Indeed, imprecise facts in real-world applications are often correlated, and such correlations can be captured as domain integrity constraints (e.g., repairs with the same customer names and models took place in the same city, or a text span can refer to a person or a city, but not both). In this paper we provide a framework for answering OLAP aggregation queries over imprecise data in the presence of such domain constraints. We first describe a relatively simple yet powerful constraint language, and formalize what it means to take into account such constraints in query answering. Next, we prove that OLAP queries can be answered efficiently given a database D* of fact marginals. We then exploit the regularities in the constraint space (captured in a constraint hypergraph) and the fact space to efficiently construct D*. We present extensive experiments over real-world and synthetic data to demonstrate the effectiveness of our approach.
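To make the multiple-world semantics and the role of fact marginals concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm: it enumerates every possible world, which is exponential in the number of imprecise facts, whereas the paper constructs D* efficiently through the constraint hypergraph. The facts, the field names, the uniform weighting of worlds, and the expected-value reading of a SUM query are all assumptions made for illustration only.

```python
from itertools import product
from collections import defaultdict

# Hypothetical imprecise facts: each repair may have taken place in any of
# the listed cities (a region rather than a point in the City dimension).
facts = [
    {"id": 1, "customer": "Ann", "model": "M1", "cities": ["Madison", "Austin"], "cost": 100.0},
    {"id": 2, "customer": "Ann", "model": "M1", "cities": ["Madison"], "cost": 50.0},
    {"id": 3, "customer": "Bob", "model": "M2", "cities": ["Madison", "Austin"], "cost": 80.0},
]

def satisfies(world):
    """Domain constraint from the abstract: repairs with the same customer
    name and model took place in the same city."""
    seen = {}
    for fact, city in zip(facts, world):
        key = (fact["customer"], fact["model"])
        if seen.setdefault(key, city) != city:
            return False
    return True

def fact_marginals():
    """Probability that each fact falls in each city, taken over the worlds
    that satisfy the constraint (uniform weighting is an assumption)."""
    worlds = [w for w in product(*(f["cities"] for f in facts)) if satisfies(w)]
    marginals = defaultdict(float)
    for world in worlds:
        for fact, city in zip(facts, world):
            marginals[(fact["id"], city)] += 1.0 / len(worlds)
    return dict(marginals)

def expected_sum(marginals, city):
    """Expected SUM(cost) for one city, computed from the marginals alone."""
    cost = {f["id"]: f["cost"] for f in facts}
    return sum(p * cost[fid] for (fid, c), p in marginals.items() if c == city)

marginals = fact_marginals()
print(expected_sum(marginals, "Madison"))
print(expected_sum(marginals, "Austin"))
```

In this toy instance the constraint forces fact 1 into Madison, because fact 2 shares its customer and model and is known to be in Madison. That is exactly the kind of correlation that is lost when imprecise facts are interpreted independently of one another.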