Characteristic relational patterns

Authors:
Arne Koopman;Arno Siebes
Affiliations:
Universiteit Utrecht, Utrecht, Netherlands;Universiteit Utrecht, Utrecht, Netherlands
Venue:
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2009

Abstract

Research in relational data mining has two major directions: finding global models of a relational database and discovering local relational patterns within a database. While relational patterns show in detail how attribute values co-occur, their sheer number hampers their use in data analysis. Global models, on the other hand, only summarise how different tables and their attributes relate to each other, and lack detail about what happens at the local level. In this paper we introduce a new approach that combines the positive properties of both directions: it provides a detailed description of the complete database using a small set of patterns. More specifically, we utilise a rich pattern language and show how a database can be encoded by such patterns. Then, based on the MDL principle, the novel RDB-KRIMP algorithm selects the set of patterns that allows for the most succinct encoding of the database. This set, the code table, is a compact description of the database in terms of local relational patterns. We show that the resulting set is very small, both in terms of database size and in the number of its local relational patterns: a reduction of up to 4 orders of magnitude is attained.
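
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the RDB-KRIMP algorithm itself: it assumes a single table of plain item sets rather than relational patterns, and the function names (cover, total_bits, select_code_table), the cover order, and the simplified encoded-size formula are all assumptions made for illustration. The idea it shows is the MDL-style one: a candidate pattern is kept in the code table only if it shrinks the total encoded size of code table plus database.

```python
from collections import Counter
from math import log2


def cover(transaction, code_table):
    """Cover a transaction with non-overlapping patterns, in code-table order."""
    remaining = set(transaction)
    used = []
    for pattern in code_table:
        if pattern <= remaining:
            used.append(pattern)
            remaining -= pattern
    return used


def standard_order(patterns, support):
    """Assumed cover order: longer patterns first, then higher support."""
    return sorted(patterns, key=lambda p: (-len(p), -support[p], sorted(p)))


def total_bits(db, code_table, item_bits):
    """Encoded size of the database plus the code table, in bits (simplified)."""
    usage = Counter()
    for transaction in db:
        usage.update(cover(transaction, code_table))
    total_usage = sum(usage.values())
    bits = 0.0
    for pattern, count in usage.items():
        code_length = -log2(count / total_usage)  # frequent patterns get short codes
        bits += count * code_length               # cost of the encoded database
        # Cost of the code-table entry: its code plus its items spelled out.
        bits += code_length + sum(item_bits[item] for item in pattern)
    return bits


def select_code_table(database, candidates):
    """Greedy selection: keep a candidate only if total size decreases."""
    db = [frozenset(t) for t in database]
    item_counts = Counter(item for t in db for item in t)
    n_items = sum(item_counts.values())
    item_bits = {i: -log2(c / n_items) for i, c in item_counts.items()}

    # Singletons are always present, so every transaction can be covered.
    support = {frozenset([i]): c for i, c in item_counts.items()}
    cands = {frozenset(c) for c in candidates if len(c) > 1}
    for cand in cands:
        support[cand] = sum(1 for t in db if cand <= t)

    table = standard_order(list(item_counts.keys() and
                                [frozenset([i]) for i in item_counts]), support)
    best = total_bits(db, table, item_bits)
    for cand in sorted(cands, key=lambda p: (-support[p], -len(p), sorted(p))):
        trial = standard_order(table + [cand], support)
        size = total_bits(db, trial, item_bits)
        if size < best:
            table, best = trial, size
    return table, best


# Hypothetical toy data, chosen only to exercise the functions above.
toy_db = [{"a", "b", "c"}] * 8 + [{"a", "b"}] * 2 + [{"c"}] * 2
toy_table, toy_bits = select_code_table(toy_db, [{"a", "b"}, {"a", "b", "c"}])
print([sorted(p) for p in toy_table], round(toy_bits, 2))
```

In this toy run a candidate can be rejected even though it occurs often, because adding it makes the code table itself more expensive than the savings it brings on the database. That trade-off is what keeps the selected pattern set small.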