We consider the problem of defining a normalized approximation measure for multivalued dependencies in relational database theory. An approximation measure is a function mapping relation instances to real numbers. Intuitively, the number to which an instance is mapped describes the strength of the dependency in that instance. A normalized approximation measure for functional dependencies has been proposed previously: the minimum number of tuples that must be removed for the functional dependency to hold, divided by the total number of tuples. This leads naturally to a normalized measure for multivalued dependencies: the minimum number of tuples that must be removed for the multivalued dependency to hold, divided by the total number of tuples. The measure for functional dependencies can be computed efficiently, in O(|r| log |r|) time, where |r| is the number of tuples in the relation instance r. However, we show that an efficient algorithm for computing the analogous measure for multivalued dependencies is unlikely to exist: a polynomial-time algorithm for computing the measure would yield a polynomial-time algorithm for an NP-complete problem (shown by a reduction from the maximum edge biclique problem in graph theory). Hence, we argue that it is not a good measure. We propose an alternative measure based on the lossless join characterization of multivalued dependencies. This measure is efficiently computable, in O(|r|^2) time.
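
To make the two efficiently computable quantities concrete, here is a minimal Python sketch. The function `fd_error` follows the tuple-removal measure for functional dependencies as stated above: within each group of tuples that agree on the left-hand side, it keeps the most frequent right-hand-side value and counts the rest as removed. The function `mvd_join_error` is only an assumption about what a measure based on the lossless-join characterization could look like. The abstract does not define the proposed measure, so the ratio used here (the fraction of tuples in the join of the two projections that are missing from r) is illustrative and may differ from the authors' definition. The function names, the representation of attributes as column indices, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict


def fd_error(rows, lhs, rhs):
    """Fraction of tuples that must be removed for the FD lhs -> rhs to hold.

    rows: iterable of equal-length tuples; lhs, rhs: lists of column indices.
    """
    rows = set(rows)  # a relation instance is a set of tuples
    if not rows:
        return 0.0
    groups = defaultdict(Counter)
    for row in rows:
        x = tuple(row[i] for i in lhs)
        y = tuple(row[i] for i in rhs)
        groups[x][y] += 1
    # In each lhs-group, keeping the most frequent rhs value removes the fewest tuples.
    kept = sum(max(counts.values()) for counts in groups.values())
    return (len(rows) - kept) / len(rows)


def mvd_join_error(rows, x, y):
    """ASSUMED join-based measure for the MVD x ->> y (not taken from the source).

    Returns 1 - |r| / |pi_XY(r) join pi_XZ(r)|, where Z is every column not in
    x or y. The value is 0 exactly when the decomposition is lossless, i.e. when
    the MVD holds. Assumes x and y are disjoint lists of column indices.
    """
    rows = set(rows)
    if not rows:
        return 0.0
    arity = len(next(iter(rows)))
    z = [i for i in range(arity) if i not in x and i not in y]
    y_values = defaultdict(set)
    z_values = defaultdict(set)
    for row in rows:
        key = tuple(row[i] for i in x)
        y_values[key].add(tuple(row[i] for i in y))
        z_values[key].add(tuple(row[i] for i in z))
    # Each x-value contributes (#distinct y-parts) * (#distinct z-parts) join tuples.
    join_size = sum(len(y_values[k]) * len(z_values[k]) for k in y_values)
    return 1 - len(rows) / join_size


# Hypothetical sample relation with columns (0, 1, 2).
sample = [(1, "a", "p"), (1, "a", "q"), (1, "b", "p"), (2, "c", "r")]
fd_result = fd_error(sample, [0], [1])
mvd_result = mvd_join_error(sample, [0], [1])
```

On the sample relation, `fd_error` returns 0.25, since one of the four tuples must go for column 0 to determine column 1. `mvd_join_error` returns about 0.2, since the join of the projections has five tuples and only four are in the relation; adding the tuple `(1, "b", "q")` brings it to 0. The sketch uses hashing rather than sorting, so its running time is not meant to reproduce the bounds quoted in the abstract.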