Parallel SQL Based Association Rule Mining on Large Scale PC Cluster: Performance Comparison with Directly Coded C Implementation

  • Authors:
  • Iko Pramudiono; Takahiko Shintani; Takayuki Tamura; Masaru Kitsuregawa

  • Venue:
  • PAKDD '99 Proceedings of the Third Pacific-Asia Conference on Methodologies for Knowledge Discovery and Data Mining
  • Year:
  • 1999
  • SQL based frequent pattern mining with FP-Growth

    INAP'04/WLP'04 Proceedings of the 15th international conference on Applications of Declarative Programming and Knowledge Management, and 18th international conference on Workshop on Logic Programming

Abstract

Data mining is becoming increasingly important as databases grow ever larger and the need to discover hidden rules in them becomes widely recognized. Database systems are currently dominated by relational databases, and the ability to perform data mining with standard SQL queries would considerably ease the implementation of data mining. However, the performance of SQL-based data mining is known to fall behind that of specialized implementations. In this paper we present an evaluation of parallel SQL-based data mining on a large-scale PC cluster. The performance achieved by parallelizing the SQL query for mining association rules on 4 processing nodes is on par with that of a C-based program.
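
The abstract does not reproduce the queries used in the paper. As a rough illustration of what counting itemset support in standard SQL can look like, the following minimal sketch finds frequent item pairs with a self-join. The table name `sales`, the columns `tid` and `item`, and the helper `frequent_pairs` are hypothetical; the sketch runs on a single in-memory SQLite database and is not the authors' parallel implementation.

```python
import sqlite3


def frequent_pairs(transactions, min_support):
    """Return {(item_a, item_b): count} for item pairs that occur together
    in at least `min_support` transactions.

    Illustrative only: the schema and query are assumptions, not the
    queries evaluated in the paper.
    """
    conn = sqlite3.connect(":memory:")
    # One row per (transaction id, item); hypothetical schema.
    conn.execute("CREATE TABLE sales (tid INTEGER, item TEXT)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [
            (tid, item)
            for tid, items in enumerate(transactions)
            for item in set(items)  # drop duplicate items within a transaction
        ],
    )
    # Self-join on the transaction id enumerates every item pair per
    # transaction; a.item < b.item keeps each unordered pair exactly once.
    rows = conn.execute(
        """
        SELECT a.item, b.item, COUNT(*) AS support
        FROM sales AS a
        JOIN sales AS b ON a.tid = b.tid AND a.item < b.item
        GROUP BY a.item, b.item
        HAVING COUNT(*) >= ?
        ORDER BY a.item, b.item
        """,
        (min_support,),
    ).fetchall()
    conn.close()
    return {(a, b): n for a, b, n in rows}


if __name__ == "__main__":
    baskets = [
        ["bread", "milk"],
        ["bread", "milk", "eggs"],
        ["milk", "eggs"],
        ["bread", "eggs", "milk"],
    ]
    print(frequent_pairs(baskets, min_support=3))
```

In a parallel setting, one would presumably partition the transaction table across nodes and combine the per-node counts, but how the paper does this is not stated in the abstract.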