Programming relational databases for Itemset mining over large transactional tables

Authors:
Ronnie Alves;Orlando Belo
Affiliations:
Department of Informatics, University of Minho, Braga, Portugal;Department of Informatics, University of Minho, Braga, Portugal
Venue:
EPIA'05 Proceedings of the 12th Portuguese conference on Progress in Artificial Intelligence
Year:
2005

Abstract

Most itemset mining approaches are memory-based and run outside the database. In a data warehouse, however, tables are far too large to copy into memory. A pure SQL approach, on the other hand, is quite inefficient. Such implementations rarely take advantage of database programming, even though RDBMS vendors offer many features for controlling and managing data. We propose a pattern-growth mining approach that uses database programming to find all frequent itemsets. The main idea is to avoid one-at-a-time record retrieval from the database, which saves copying and process context switching, as well as expensive joins and table reconstruction. The empirical evaluation shows that our approach runs competitively with the best-known SQL-based itemset mining implementations. Our performance evaluation was carried out with SQL Server 2000 (version 8) and T-SQL on several synthetic datasets.
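
To make the idea of pattern-growth mining inside the database more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a transactional table of (tid, item) rows and uses Python's built-in sqlite3 module as a stand-in for SQL Server and T-SQL. The function name `mine_frequent_itemsets`, the table names `t0`, `t1`, ... and the sample data are hypothetical. Every counting and projection step is a single set-oriented statement executed by the database engine, so no transaction record is fetched one at a time. Unlike the approach described in the abstract, this naive version does materialize projected tables, which is one of the costs the authors say they avoid.

```python
import sqlite3

# Hypothetical sample data: (transaction id, item) pairs.
EXAMPLE_ROWS = [
    (1, "a"), (1, "b"), (1, "c"),
    (2, "a"), (2, "b"),
    (3, "a"), (3, "c"),
    (4, "b"), (4, "c"),
    (5, "a"), (5, "b"), (5, "c"),
]


def mine_frequent_itemsets(rows, min_support):
    """Return {frozenset(items): support} for all frequent itemsets.

    Illustrative pattern-growth sketch: each frequent item extends the
    current prefix, and mining continues on the projection of the table
    onto transactions that contain that item.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t0 (tid INTEGER, item TEXT)")
    con.executemany("INSERT INTO t0 VALUES (?, ?)", rows)

    result = {}
    table_counter = [0]  # used to generate unique projected-table names

    def grow(table, prefix):
        # One set-oriented query counts the support of every item
        # in the current (projected) table.
        frequent = con.execute(
            f"SELECT item, COUNT(DISTINCT tid) FROM {table} "
            "GROUP BY item HAVING COUNT(DISTINCT tid) >= ? ORDER BY item",
            (min_support,),
        ).fetchall()

        for item, support in frequent:
            itemset = prefix | {item}
            result[frozenset(itemset)] = support

            # Projection: keep transactions that contain `item`, restricted
            # to items that sort after it, so each itemset is found once.
            table_counter[0] += 1
            projected = f"t{table_counter[0]}"
            con.execute(f"CREATE TEMP TABLE {projected} (tid INTEGER, item TEXT)")
            con.execute(
                f"INSERT INTO {projected} "
                f"SELECT tid, item FROM {table} "
                f"WHERE item > ? AND tid IN "
                f"(SELECT tid FROM {table} WHERE item = ?)",
                (item, item),
            )
            grow(projected, itemset)
            con.execute(f"DROP TABLE {projected}")

    grow("t0", frozenset())
    con.close()
    return result
```

Calling `mine_frequent_itemsets(EXAMPLE_ROWS, 3)` on the sample data yields the three single items and the three pairs, while the triple {a, b, c} falls below the support threshold. In a server-side setting the recursion itself would live in a stored procedure, which is where the savings in copying and context switching mentioned in the abstract would come from.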