Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
SQL-based discovery of exact and approximate functional dependencies
Working group reports from ITiCSE on Innovation and technology in computer science education
GORDIAN: efficient and scalable discovery of composite keys
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Database dependency discovery: a machine learning approach
AI Communications
The data analytics group at the Qatar Computing Research Institute
ACM SIGMOD Record
Unique column combinations of a relational database table are sets of columns that contain only unique values. Discovering such combinations is a fundamental research problem with many applications in data management and knowledge discovery. Existing discovery algorithms are either brute force or have high memory consumption, and can thus be applied only to small datasets or samples. In this paper, we compare the well-known Gordian algorithm [9] with "Apriori-based" algorithms [4] and analyze both for further optimization. We greatly improve the Apriori algorithms through efficient candidate generation and statistics-based pruning methods. A hybrid solution, HCA-Gordian, combines the advantages of Gordian and our new algorithm HCA, and it outperforms all previous work in many situations.
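To make the definition concrete, the following is a minimal sketch of a level-wise (bottom-up) search for minimal unique column combinations. It is not HCA, Gordian, or any algorithm from the paper: it enumerates every column combination by increasing size and applies only the simplest minimality pruning, which is exactly the kind of brute-force baseline that does not scale. The function names `is_unique` and `minimal_uniques` and the sample rows are assumptions introduced for illustration.

```python
from itertools import combinations


def is_unique(rows, cols):
    """Return True if projecting rows onto cols yields no duplicate tuples."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in cols)
        if key in seen:
            return False
        seen.add(key)
    return True


def minimal_uniques(rows, num_cols):
    """Level-wise search for minimal unique column combinations.

    Illustrative brute force: every combination is enumerated, smallest first.
    """
    uniques = []
    for size in range(1, num_cols + 1):
        for cols in combinations(range(num_cols), size):
            # A superset of a known unique is also unique, but not minimal,
            # so it can be skipped without scanning the data.
            if any(set(u) <= set(cols) for u in uniques):
                continue
            if is_unique(rows, cols):
                uniques.append(cols)
    return uniques


# Hypothetical sample table: (first name, last name, age).
rows = [
    ("alice", "smith", 30),
    ("alice", "jones", 30),
    ("bob", "smith", 25),
]
result = minimal_uniques(rows, 3)
print(result)  # [(0, 1), (1, 2)]
```

No single column is unique in this table, but the pairs (first name, last name) and (last name, age) are. The number of candidates grows exponentially with the number of columns, which is why candidate generation and pruning matter for larger tables.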