Querying inconsistent databases remains a broad and difficult problem. In this work, we study how to improve aggregations computed on databases with referential integrity errors in the context of database integration, where the source databases have different tables, columns with similar content, and different referential integrity constraints. As a result, a query on an integrated database may involve tables and columns with referential integrity errors. In a data warehouse, ETL processes do fix such errors, but generally by inserting "dummy" records into the dimension tables for the invalid foreign keys, which enforces referential integrity only artificially. When two tables are joined and aggregations are computed, rows with an invalid or null foreign key value are skipped, which discards potentially valuable information. With that motivation, we extend SQL aggregate functions computed over tables with referential integrity errors to return complete answer sets, in the sense that no row is excluded. We associate with each referenced key in the dimension table a probability that invalid or null foreign keys refer to it. Our main idea is to compute aggregations over joined tables, including rows with invalid or null references, by distributing the contribution of those rows to the aggregation totals according to probabilities computed over correct foreign keys. Experiments with real and synthetic databases evaluate the usefulness, accuracy and performance of our extended aggregations.
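To make the distribution idea concrete, the following is a minimal Python sketch of one possible extended SUM. It is not the method defined in the paper. It assumes, purely for illustration, that the probability attached to each referenced key is the relative frequency of that key among the valid foreign keys, and that the measure of every row with an invalid or null foreign key is split across the referenced keys in proportion to those probabilities. The function name `extended_sum` and the sample data are hypothetical.

```python
def extended_sum(fact_rows, dimension_keys):
    """Group-by SUM over a fact table that keeps rows with bad references.

    fact_rows: iterable of (foreign_key, measure) pairs; foreign_key may be
        None or a value missing from dimension_keys.
    dimension_keys: keys present in the referenced (dimension) table.
    Returns a dict mapping each dimension key to its adjusted total.
    """
    keys = set(dimension_keys)
    totals = {k: 0.0 for k in keys}   # ordinary join-based totals
    counts = {k: 0 for k in keys}     # valid references per key
    orphan_total = 0.0                # measure carried by invalid/null keys

    for fk, measure in fact_rows:
        if fk in keys:
            totals[fk] += measure
            counts[fk] += 1
        else:
            orphan_total += measure

    n_valid = sum(counts.values())
    if not keys:
        return {}
    if n_valid == 0:
        # Assumption: with no valid references, fall back to a uniform split.
        probs = {k: 1.0 / len(keys) for k in keys}
    else:
        # Assumption: probability = relative frequency among valid references.
        probs = {k: counts[k] / n_valid for k in keys}

    # Distribute the orphaned measure according to the probabilities.
    return {k: totals[k] + probs[k] * orphan_total for k in keys}


# Hypothetical data: two rows reference key 1, one row references key 2,
# one row has a null foreign key and one has an invalid key (99).
rows = [(1, 10.0), (1, 20.0), (2, 30.0), (None, 8.0), (99, 4.0)]
result = extended_sum(rows, {1, 2})
grand_total = sum(result.values())
```

With this sample data, a plain join would report 30 for each key and silently drop the 12 units carried by the two bad rows. The sketch instead assigns roughly 38 to key 1 and 34 to key 2, so the grand total matches the sum of all measures in the fact table.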