Unary and n-ary inclusion dependency discovery in relational databases

Authors:
Fabien De Marchi;Stéphane Lopes;Jean-Marc Petit
Affiliations:
Laboratoire LIRIS, Université de LYON, Université LYON 1, Villeurbanne, France 69621;Laboratoire PRISM, Université de Versailles Saint-Quentin en Yvelines, Versailles Cedex, France 78035;Laboratoire LIRIS, Université de LYON, INSA de Lyon, Villeurbanne, France 69621
Venue:
Journal of Intelligent Information Systems
Year:
2009


Abstract

Foreign keys are among the most fundamental constraints in relational databases. Since they are not always defined in existing databases, discovering them is an important and challenging task. The underlying problem is known as the inclusion dependency (IND) inference problem. In this paper, we devise data-mining algorithms for IND inference in a given database. We propose a two-step approach. In the first step, unary INDs are discovered using a new preprocessing stage, which leads to a new algorithm and an efficient implementation. In the second step, n-ary INDs are inferred; this step fits the framework of levelwise algorithms used in many data-mining tasks. Since real-world databases can suffer from data inconsistencies, approximate INDs, i.e. INDs which almost hold, are also considered. We show how they can be safely integrated into our unary and n-ary discovery algorithms. The algorithms have been implemented and tested against both synthetic and real-life databases. To the best of our knowledge, no other algorithm exists to solve this data-mining problem.
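
To make the notions of a unary IND and an approximate IND concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not detail the preprocessing stage, so the value-to-attributes index used below is just one plausible way to find all unary INDs in a single pass. The function names (`unary_inds`, `ind_error`), the column names, the treatment of NULLs, and the error measure (the fraction of distinct left-hand-side values missing from the right-hand side) are all hypothetical choices made for this example.

```python
from collections import defaultdict


def unary_inds(columns):
    """Return all pairs (A, B), A != B, such that values(A) is a subset of values(B).

    `columns` maps an attribute name to an iterable of its values.
    Assumption: None stands for NULL and is ignored.
    """
    # Preprocessing: for each value, record the attributes in which it occurs.
    attrs_of_value = defaultdict(set)
    for attr, values in columns.items():
        for v in values:
            if v is not None:
                attrs_of_value[v].add(attr)

    # Every attribute starts as a candidate right-hand side of every attribute.
    rhs = {a: set(columns) for a in columns}

    # A value of A that is absent from B rules out the candidate A ⊆ B.
    for attrs in attrs_of_value.values():
        for a in attrs:
            rhs[a] &= attrs

    # Note: an attribute with no non-NULL value is trivially included everywhere.
    return {(a, b) for a, bs in rhs.items() for b in bs if a != b}


def ind_error(lhs_values, rhs_values):
    """Fraction of distinct non-NULL lhs values that do not occur in rhs.

    Assumed error measure for this sketch: 0.0 means the IND holds exactly,
    and a small positive value means it "almost holds".
    """
    lhs = {v for v in lhs_values if v is not None}
    if not lhs:
        return 0.0
    return len(lhs - set(rhs_values)) / len(lhs)


# Hypothetical toy database: orders.cust_id behaves like a foreign key
# referencing customers.id.
columns = {
    "orders.cust_id": [1, 2, 2, 3],
    "customers.id": [1, 2, 3, 4],
    "customers.age": [30, 41, 30, 25],
}
exact = unary_inds(columns)

# One dirty value (5) out of three distinct values violates the inclusion.
err = ind_error([1, 2, 2, 5], columns["customers.id"])
```

On this toy input, `exact` contains only the pair `("orders.cust_id", "customers.id")`, and `err` is one third. An approximate discovery procedure would accept a candidate whenever its error stays below a user-chosen threshold; the n-ary step would then combine the surviving unary INDs level by level, which this sketch does not attempt.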