We study the problem of repairing XML functional dependency violations with modifications of minimum repair cost. Our cost model assigns a weight to each leaf node in the XML document, and the cost of a repair is the total weight of the nodes it modifies. An optimum repair is a repair of minimum cost among all repairs. We prove lower and upper bounds for the optimum XML repair problem. We show that finding an optimum repair is out of reach in practice: the problem is already NP-complete for a fixed DTD, a fixed set of functional dependencies, and equal weights for all nodes in the XML document. Instead, we provide an efficient two-step heuristic method for repairing XML functional dependency violations. First, the initial violations are captured and fixed by leveraging the conflict hypergraph. Second, the remaining conflicts are resolved by modifying the violating nodes and their related nodes, called determinants, in a way that guarantees no new violations are introduced. We implement our method and evaluate it on synthetic and real-life data. The experimental results demonstrate that our algorithm scales well and is effective at improving data quality.
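The cost model and the notion of a conflict can be illustrated with a small sketch. It is not the paper's algorithm: it assumes the XML leaves have already been flattened into plain Python dictionaries, and the names `repair_cost`, `conflict_edges`, and the sample `isbn`/`title` data are hypothetical. For a single functional dependency each conflict involves two records, so the hyperedges of the conflict hypergraph reduce to pairs here.

```python
from collections import defaultdict
from itertools import combinations


def repair_cost(original, repaired, weights):
    """Total weight of the leaf nodes whose value was modified by the repair."""
    return sum(
        weights[node]
        for node in original
        if original[node] != repaired[node]
    )


def conflict_edges(records, lhs, rhs):
    """Pairs of record ids that agree on `lhs` but differ on `rhs`,
    i.e. violations of the functional dependency lhs -> rhs."""
    groups = defaultdict(list)
    for rid, rec in records.items():
        groups[tuple(rec[a] for a in lhs)].append(rid)
    edges = []
    for rids in groups.values():
        for r1, r2 in combinations(rids, 2):
            if records[r1][rhs] != records[r2][rhs]:
                edges.append((r1, r2))
    return edges


# Hypothetical data: three book records, with the dependency isbn -> title.
records = {
    "b1": {"isbn": "1", "title": "A"},
    "b2": {"isbn": "1", "title": "B"},
    "b3": {"isbn": "2", "title": "C"},
}
violations = conflict_edges(records, ["isbn"], "title")

# Hypothetical leaf values and weights; the repair changes b2's title only.
original = {"b1/title": "A", "b2/title": "B", "b3/title": "C"}
repaired = {"b1/title": "A", "b2/title": "A", "b3/title": "C"}
weights = {"b1/title": 1.0, "b2/title": 0.5, "b3/title": 1.0}
cost = repair_cost(original, repaired, weights)
```

In this toy instance the only conflict is between `b1` and `b2`, and changing the lower-weight leaf `b2/title` resolves it at cost 0.5, which is cheaper than changing `b1/title` at cost 1.0.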