Programming relational databases for Itemset mining over large transactional tables

  • Authors:
  • Ronnie Alves; Orlando Belo

  • Affiliations:
  • Department of Informatics, University of Minho, Braga, Portugal; Department of Informatics, University of Minho, Braga, Portugal

  • Venue:
  • EPIA'05 Proceedings of the 12th Portuguese conference on Progress in Artificial Intelligence
  • Year:
  • 2005

Abstract

Most itemset mining approaches are memory-based and run outside the database. When dealing with a data warehouse, however, the tables are far too large to be copied into memory. A pure SQL approach, in turn, is quite inefficient, and such implementations rarely take advantage of database programming, even though RDBMS vendors offer many features for controlling and managing the data. We propose a pattern-growth mining approach that uses database programming to find all frequent itemsets. The main idea is to avoid one-at-a-time record retrieval from the database, saving the cost of copying and process context switching, expensive joins, and table reconstruction. The empirical evaluation shows that our approach runs competitively with the best-known SQL-based itemset mining implementations. The performance evaluation was carried out with SQL Server 2000 (v.8) and T-SQL, over several synthetic datasets.
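
As a rough illustration of avoiding one-at-a-time record retrieval, the sketch below counts item supports with a single set-based query executed inside the database, so that only the aggregated result crosses into the client. It is not the authors' T-SQL pattern-growth procedure: SQLite (through Python's sqlite3 module) stands in for SQL Server, it covers only the first step (frequent single items), and the table name transactions, the columns tid and item, and the function frequent_items are all hypothetical names chosen for this example.

```python
import sqlite3


def frequent_items(conn, min_support):
    """Return (item, support) pairs whose support meets min_support.

    The counting happens inside the database in one set-based query,
    so no transaction rows are fetched into the client one by one.
    Table and column names are assumptions made for this sketch.
    """
    query = (
        "SELECT item, COUNT(DISTINCT tid) AS support "
        "FROM transactions "
        "GROUP BY item "
        "HAVING COUNT(DISTINCT tid) >= ? "
        "ORDER BY support DESC, item"
    )
    return conn.execute(query, (min_support,)).fetchall()


# Tiny in-memory transactional table: one row per (transaction id, item).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (tid INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        (1, "a"), (1, "b"), (1, "c"),
        (2, "a"), (2, "b"),
        (3, "a"), (3, "d"),
        (4, "b"), (4, "c"),
    ],
)

# Only the aggregated supports are returned to the client.
for item, support in frequent_items(conn, min_support=2):
    print(item, support)
```

The contrast is with a client-side loop that fetches every row and updates counters in application memory; here the engine does the grouping and filtering, which is the general direction the abstract argues for, although the paper's own procedure goes further and grows patterns inside the database.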