Index Support for Frequent Itemset Mining in a Relational DBMS

Authors:
Elena Baralis;Tania Cerquitelli;Silvia Chiusano
Affiliations:
Politecnico di Torino;Politecnico di Torino;Politecnico di Torino
Venue:
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Year:
2005

Citing 12
Cited 1

Integrating association rule mining with relational database systems: alternatives and implications

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Mining frequent patterns without candidate generation

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Scalable Algorithms for Association Mining

IEEE Transactions on Knowledge and Data Engineering
Database Mining: A Performance Perspective

IEEE Transactions on Knowledge and Data Engineering
A Tightly-Coupled Architecture for Data Mining

ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
A New SQL-like Operator for Mining Association Rules

VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
A Comparison between Query Languages for the Extraction of Association Rules

DaWaK 2000 Proceedings of the 4th International Conference on Data Warehousing and Knowledge Discovery
Integrating Data Mining with Relational DBMS: A Tightly-Coupled Approach

NGIT '99 Proceedings of the 4th International Workshop on Next Generation Information Technologies and Systems
Efficient Indexing Structures for Mining Frequent Patterns

ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Inverted matrix: efficient discovery of frequent items in large datasets in the context of interactive mining

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
On computing, storing and querying frequent patterns

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining

Shaping SQL-Based frequent pattern mining algorithms

KDID'05 Proceedings of the 4th international conference on Knowledge Discovery in Inductive Databases

Quantified Score

Hi-index	0.00

Visualization

Abstract

Many efforts have been devoted to couple data mining activities with relational DBMSs, but a true integration into the relational DBMS kernel has been rarely achieved. This paper presents a novel indexing technique, which represents transactions in a succinct form, appropriate for tightly integrating frequent itemset mining in a relational DBMS. The data representation is complete, i.e., no support threshold is enforced, in order to allow reusing the index for mining itemsets with any support threshold. Furthermore, an appropriate structure of the stored information has been devised, in order to allow a selective access of the index blocks necessary for the current extraction phase. The index has been implemented into the PostgreSQL open source DBMS and exploits its physical level access methods. Experiments have been run for various datasets, characterized by different data distributions. The execution time of the frequent itemset extraction task exploiting the index is always comparable with and sometime faster than a C++ implementation of the FP-growth algorithm accessing data stored on a flat file.