Keys play a fundamental role in all data models. They allow database systems to uniquely identify data items, and therefore promote efficient data processing in most applications. Because of this role, support is required for discovering keys, including keys that are semantically meaningful for the application domain and keys that are satisfied by a given database instance. Here, we study the discovery of keys from SQL tables. We investigate structural and computational properties of Armstrong tables for sets of SQL keys that are currently perceived as semantically meaningful. Inspecting Armstrong tables enables data engineers to consolidate their understanding of the semantics of the application domain and to communicate this understanding to other stakeholders of the database, e.g., domain experts or managers. The stakeholders may want to change the tables or provide entirely different tables in order to communicate their expert views to the data engineers. For this purpose, we propose data mining algorithms that discover keys from a given SQL table. Finally, we define formal measures to assess the distance between sets of SQL keys. The measures can be applied to empirically validate the usefulness of Armstrong tables, and to automate marking and feedback of non-multiple-choice questions in database courses.
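To make the notion of discovering keys from a given SQL table concrete, the following is a minimal brute-force sketch, not the algorithms proposed in the paper. It assumes the common SQL UNIQUE reading of a key: two rows violate a key only if they agree on all key columns and neither row has a NULL (here `None`) in any of them. The function names `satisfies_unique` and `minimal_keys` and the sample table are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def satisfies_unique(rows, cols):
    """Return True if no two rows agree on all of `cols`.

    Assumption (SQL UNIQUE-style reading): a row with None in any of
    `cols` is skipped and can never take part in a violation.
    """
    seen = set()
    for row in rows:
        projection = tuple(row[c] for c in cols)
        if any(value is None for value in projection):
            continue
        if projection in seen:
            return False
        seen.add(projection)
    return True


def minimal_keys(rows, columns):
    """Enumerate column sets by increasing size and keep the minimal keys.

    Brute force: exponential in the number of columns, so this is only
    suitable for small illustrative tables.
    """
    found = []
    for size in range(1, len(columns) + 1):
        for candidate in combinations(columns, size):
            # Skip supersets of keys already found: they are not minimal.
            if any(set(key) <= set(candidate) for key in found):
                continue
            if satisfies_unique(rows, candidate):
                found.append(candidate)
    return found


# Hypothetical sample table; None stands for an SQL NULL marker.
rows = [
    {"emp": 1, "dept": "A", "room": None},
    {"emp": 2, "dept": "A", "room": 10},
    {"emp": 3, "dept": "B", "room": 10},
]
columns = ["emp", "dept", "room"]
keys = minimal_keys(rows, columns)
print(keys)
```

On this sample, `emp` alone is a key. The pair `dept`, `room` also qualifies under the assumed semantics, because the only row that could clash on `dept` has a NULL in `room`. Under a stricter reading of keys over incomplete data the result may differ, which is why the semantics is stated as an assumption here.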