A foreign/primary key relationship between relational tables is one of the most important constraints in a database. From a data analysis perspective, discovering foreign keys is a crucial step in understanding and working with the data. Nevertheless, more often than not, foreign key constraints are not specified in the data, for various reasons; e.g., some associations are not known to designers but are inherent in the data, while others become invalid due to data inconsistencies. This work proposes a robust algorithm for discovering single-column and multi-column foreign keys. Previous work concentrated mostly on discovering single-column foreign keys using a variety of rules, like inclusion dependencies, column names, and minimum/maximum values. We first propose a general rule, termed Randomness, that subsumes a variety of other rules. We then develop efficient approximation algorithms for evaluating randomness, using only two passes over the data. Finally, we validate our approach via extensive experiments using real and synthetic datasets.
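To make the inclusion-dependency rule mentioned above concrete, here is a minimal Python sketch of how single-column foreign key candidates might be enumerated from that rule alone. It is not the Randomness test proposed in this work, which the abstract does not define. The function names, the in-memory table layout (a dict of column name to list of values), and the toy data are all assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Tuple

Column = List[Hashable]
Tables = Dict[str, Dict[str, Column]]  # hypothetical layout: table -> column -> values


def is_key(column: Column) -> bool:
    """A column can serve as a key only if it has no NULLs and no duplicates."""
    return None not in column and len(column) == len(set(column))


def satisfies_inclusion(fk_column: Column, pk_column: Column) -> bool:
    """Unary inclusion dependency: every non-NULL value of fk_column occurs in pk_column."""
    referenced = set(pk_column)
    return all(v in referenced for v in fk_column if v is not None)


def candidate_foreign_keys(tables: Tables) -> List[Tuple[Tuple[str, str], Tuple[str, str]]]:
    """Return ((fk_table, fk_column), (pk_table, pk_column)) pairs that pass the inclusion rule."""
    columns = [(t, c, vals) for t, cols in tables.items() for c, vals in cols.items()]
    candidates = []
    for pk_table, pk_name, pk_vals in columns:
        if not is_key(pk_vals):
            continue  # only unique, NULL-free columns can be referenced
        for fk_table, fk_name, fk_vals in columns:
            if (fk_table, fk_name) == (pk_table, pk_name):
                continue  # skip the trivial self-pair
            if satisfies_inclusion(fk_vals, pk_vals):
                candidates.append(((fk_table, fk_name), (pk_table, pk_name)))
    return candidates


# Toy data, invented for this example.
tables = {
    "customers": {"id": [1, 2, 3, 4]},
    "orders": {
        "order_id": [10, 11, 12],
        "customer_id": [1, 1, 3],
        "quantity": [2, 1, 1],
    },
}

for fk, pk in candidate_foreign_keys(tables):
    print(f"{fk[0]}.{fk[1]} -> {pk[0]}.{pk[1]}")
```

On this toy data the inclusion rule accepts both orders.customer_id and orders.quantity as references to customers.id, because small integers in quantity happen to fall inside the id range. Spurious matches of this kind are why inclusion is usually combined with further rules, such as the column-name and minimum/maximum-value rules mentioned above.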