Discovering matching dependencies
Proceedings of the 18th ACM conference on Information and knowledge management
The concept of matching dependencies (MDs) has recently been proposed for specifying matching rules for object identification. Like conditional functional dependencies, MDs can also be applied to various data quality tasks, such as detecting violations of integrity constraints. In this paper, we study the problem of discovering similarity constraints for matching dependencies from a given database instance. First, we introduce two measures, support and confidence, for evaluating the utility of MDs on the given data. Then, we study the discovery of MDs that meet given utility requirements on support and confidence. We develop exact algorithms, together with pruning strategies that improve their running time. Since the exact algorithms have to traverse all the data during the computation, we also propose an approximate solution that uses only part of the data, and we derive a bound on the relative error introduced by the approximation. Finally, our experimental evaluation demonstrates the efficiency of the proposed methods.
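To make the two measures concrete, the following is a minimal sketch of how support and confidence of a single MD might be evaluated over all tuple pairs. The abstract does not give the formal definitions, so the ones used here are assumptions borrowed from association rule mining: support is the fraction of tuple pairs that satisfy both sides of the rule, and confidence is the fraction of pairs satisfying the left-hand side that also satisfy the right-hand side. The function names, the dictionary-based rule encoding, and the use of `difflib.SequenceMatcher` as the similarity metric are all hypothetical choices for illustration, not the paper's method.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """String similarity in [0, 1]; a stand-in for any similarity metric."""
    return SequenceMatcher(None, a, b).ratio()


def md_support_confidence(rows, lhs, rhs):
    """Evaluate one candidate MD by scanning every pair of rows.

    rows: list of dicts mapping attribute name -> string value.
    lhs, rhs: dicts mapping attribute name -> minimum similarity threshold.
    Returns (support, confidence) under the assumed definitions above.
    """
    total = lhs_hits = both_hits = 0
    for r1, r2 in combinations(rows, 2):
        total += 1
        # A side of the rule holds when every attribute meets its threshold.
        lhs_ok = all(similarity(r1[a], r2[a]) >= t for a, t in lhs.items())
        rhs_ok = all(similarity(r1[a], r2[a]) >= t for a, t in rhs.items())
        if lhs_ok:
            lhs_hits += 1
            if rhs_ok:
                both_hits += 1
    support = both_hits / total if total else 0.0
    confidence = both_hits / lhs_hits if lhs_hits else 0.0
    return support, confidence


if __name__ == "__main__":
    # Hypothetical toy relation.
    rows = [
        {"name": "Ann Lee", "city": "Boston"},
        {"name": "Ann Lee", "city": "Boston"},
        {"name": "Ann Lee", "city": "Denver"},
        {"name": "Bob Ray", "city": "Denver"},
    ]
    # Candidate rule: identical names imply identical cities.
    print(md_support_confidence(rows, {"name": 1.0}, {"city": 1.0}))
```

This full pairwise scan is quadratic in the number of rows, which illustrates why an exact evaluation must traverse all the data. Running the same function on a random subset of rows would be one simple way to trade accuracy for speed, though it is not necessarily the approximation the paper proposes.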