Real-life data is often dirty, and dirty data costs businesses worldwide billions of pounds each year. This paper presents a promising approach to improving data quality. It effectively detects and fixes inconsistencies in real-life data using conditional dependencies, which extend database dependencies by enforcing bindings of semantically related data values. It accurately identifies records from unreliable data sources using relative candidate keys, which extend keys for relations by supporting similarity and matching operators across relations. Whereas traditional dependencies were developed to improve the quality of schemas, these revised constraints are proposed to improve the quality of the data itself. They yield practical techniques for data repairing and record matching in a uniform framework.
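To make the idea of a conditional dependency concrete, the following is a minimal Python sketch of detecting violations of one such constraint. It is an illustration under stated assumptions, not the detection method of the paper: the attribute names (`country`, `zip`, `street`), the sample rows, the `_` wildcard convention, and the helper `cfd_violations` are all hypothetical. The constraint says that zip determines street, but only for tuples whose country is "UK"; a plain functional dependency would have to hold for every tuple.

```python
from collections import defaultdict

WILDCARD = "_"  # hypothetical marker: "any value" in a pattern


def matches(value, pattern_value):
    """A value matches a pattern entry if the entry is a wildcard or equal."""
    return pattern_value == WILDCARD or value == pattern_value


def cfd_violations(rows, lhs, rhs, pattern):
    """Return sorted indices of rows that violate the conditional dependency
    (lhs -> rhs) restricted to rows matching `pattern` on the lhs attributes."""
    violating = set()
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        # The dependency only applies to rows selected by the pattern.
        if not all(matches(row[a], pattern[a]) for a in lhs):
            continue
        # A constant on the right-hand side can be violated by a single row.
        if pattern[rhs] != WILDCARD and row[rhs] != pattern[rhs]:
            violating.add(i)
        groups[tuple(row[a] for a in lhs)].append(i)
    # Rows that agree on lhs must also agree on rhs.
    for members in groups.values():
        if len({rows[i][rhs] for i in members}) > 1:
            violating.update(members)
    return sorted(violating)


# Hypothetical sample data: the two UK rows share a zip but not a street.
rows = [
    {"country": "UK", "zip": "EH4 1DT", "street": "Mayfield"},
    {"country": "UK", "zip": "EH4 1DT", "street": "Crichton"},
    {"country": "US", "zip": "07974", "street": "Mountain Ave"},
    {"country": "US", "zip": "07974", "street": "Main St"},
]
pattern = {"country": "UK", "zip": WILDCARD, "street": WILDCARD}

print(cfd_violations(rows, ["country", "zip"], "street", pattern))  # → [0, 1]
```

The two US rows also disagree on street for the same zip, but they are not flagged because the pattern binds the constraint to UK tuples only. This binding of data values is what distinguishes a conditional dependency from a traditional one.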