Performance Evaluation and Optimization of Join Queries for Association Rule Mining

Authors:
Shiby Thomas;Sharma Chakravarthy
Affiliations:
-;-
Venue:
DaWaK '99 Proceedings of the First International Conference on Data Warehousing and Knowledge Discovery
Year:
1999

Citing 6

Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data

Query flocks: a generalization of association-rule mining
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data

Integrating association rule mining with relational database systems: alternatives and implications
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data

Set-Oriented Mining for Association Rules in Relational Databases
ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering

Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

A New SQL-like Operator for Mining Association Rules
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases

Cited 3

SQL Based Association Rule Mining Using Commercial RDBMS (IBM DB2 UDB EEE)
DaWaK 2000 Proceedings of the Second International Conference on Data Warehousing and Knowledge Discovery

SQL based frequent pattern mining with FP-Growth
INAP'04/WLP'04 Proceedings of the 15th international conference on Applications of Declarative Programming and Knowledge Management, and 18th international conference on Workshop on Logic Programming

Shaping SQL-Based frequent pattern mining algorithms
KDID'05 Proceedings of the 4th international conference on Knowledge Discovery in Inductive Databases

Abstract

The explosive growth in data collection in business organizations introduces the problem of turning these rapidly expanding data stores into nuggets of actionable knowledge. The state-of-the-art data mining tools available for this task integrate only loosely with data stored in DBMSs, typically through a cursor interface. In this paper, we consider several formulations of association rule mining (a typical data mining problem) using SQL-92 queries and study the performance of different join orders and join methods for executing them. We analyze the cost of the different execution plans, which provides a basis for incorporating the semantics of association rule mining into future query optimizers. Based on this analysis, we identify certain optimizations and develop the Set-oriented Apriori approach. This work is an initial step towards developing "SQL-aware" mining algorithms and exploring enhancements to current relational DBMSs that make them "mining-aware", thereby bridging the gap between the two.
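
To make the idea of expressing association rule mining as SQL join queries concrete, the following is a minimal sketch of support counting for 2-itemsets with a self-join. It is not the paper's Set-oriented Apriori approach, and the abstract does not give the actual queries. The table `T(tid, item)`, the function name `frequent_pairs`, and the sample `baskets` data are all assumptions made for illustration, and SQLite is used only because it ships with Python.

```python
import sqlite3


def frequent_pairs(transactions, min_support):
    """Return (item1, item2, support) for item pairs in >= min_support transactions.

    `transactions` maps a transaction id to a list of items. The T(tid, item)
    layout is an assumed schema, not one taken from the paper.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T (tid INTEGER, item TEXT)")
    conn.executemany(
        "INSERT INTO T VALUES (?, ?)",
        [
            (tid, item)
            for tid, items in transactions.items()
            for item in sorted(set(items))  # drop duplicate items per transaction
        ],
    )
    # Self-join T on tid; t1.item < t2.item emits each unordered pair once.
    rows = conn.execute(
        """
        SELECT t1.item, t2.item, COUNT(*) AS support
        FROM T t1 JOIN T t2
          ON t1.tid = t2.tid AND t1.item < t2.item
        GROUP BY t1.item, t2.item
        HAVING COUNT(*) >= ?
        ORDER BY t1.item, t2.item
        """,
        (min_support,),
    ).fetchall()
    conn.close()
    return rows


# Hypothetical sample data: four market baskets.
baskets = {
    1: ["bread", "milk"],
    2: ["bread", "milk", "eggs"],
    3: ["milk", "eggs"],
    4: ["bread", "eggs", "milk"],
}

if __name__ == "__main__":
    print(frequent_pairs(baskets, 3))
```

Counting k-itemsets in this style needs a k-way self-join, which is where the choice of join order and join method studied in the paper starts to affect cost.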