Novel parallel method for mining frequent patterns on multi-core shared memory systems

Authors:
Lan Vu;Gita Alaghband
Affiliations:
University of Colorado Denver, Lawrence St. Denver, CO;University of Colorado Denver, Lawrence St. Denver, CO
Venue:
DISCS-2013 Proceedings of the 2013 International Workshop on Data-Intensive Scalable Computing Systems
Year:
2013

Abstract

Frequent pattern mining is an important problem in data mining with many practical applications. Current parallel methods for mining frequent patterns perform inconsistently across database types and under-utilize multi-core shared memory machines. We present ShaFEM, a novel parallel frequent pattern mining method, to address these issues. Our method dynamically adapts to the data characteristics so that it performs efficiently on both sparse and dense databases. Its lock-free parallel mining approach minimizes synchronization and maximizes data independence to enhance scalability. Its structure lends itself well to dynamic job scheduling, resulting in a well-balanced load on new multi-core shared memory architectures. We evaluate ShaFEM on a 12-core multi-socket server and find that our method runs 2.1 to 5.8 times faster than the state-of-the-art parallel method. For some test cases, ShaFEM saves 4.9 days and 12.8 hours of execution time over the compared method.
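
To make the problem statement concrete, the following minimal Python sketch shows what mining frequent patterns means and how the work can be split into independent per-item tasks that a worker pool picks up dynamically. It is not ShaFEM and does not reproduce its data structures or adaptive strategy, which the abstract does not describe. The function names (`mine_prefix`, `mine_frequent_itemsets`), the brute-force enumeration, and the use of a thread pool are all assumptions made for illustration. In CPython, threads will not give real multi-core speedup here; the sketch only shows the task structure.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def mine_prefix(transactions, items, idx, min_support):
    """Return frequent itemsets whose smallest item (in `items` order) is items[idx].

    Hypothetical helper: brute-force enumeration, suitable only for tiny inputs.
    """
    prefix = items[idx]
    later = set(items[idx + 1:])
    # Projected database: transactions containing the prefix item,
    # restricted to the items that come after it.
    projected = [t & later for t in transactions if prefix in t]
    found = {}
    if len(projected) >= min_support:
        found[frozenset([prefix])] = len(projected)
    candidates = sorted(later)
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            itemset = set(combo)
            count = sum(1 for t in projected if itemset <= t)
            if count >= min_support:
                found[frozenset(itemset | {prefix})] = count
    return found


def mine_frequent_itemsets(transactions, min_support, workers=4):
    """Mine all itemsets that occur in at least `min_support` transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    result = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each task reads the shared input but writes only to its own
        # dictionary, so no locks are needed while mining. Idle workers
        # take the next pending task, which gives simple dynamic scheduling.
        tasks = pool.map(
            lambda i: mine_prefix(transactions, items, i, min_support),
            range(len(items)),
        )
        for partial in tasks:
            result.update(partial)  # merged by the caller only
    return result


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in sorted(
        mine_frequent_itemsets(db, min_support=3).items(),
        key=lambda kv: (len(kv[0]), sorted(kv[0])),
    ):
        print(sorted(itemset), count)
```

Because every itemset is assigned to exactly one task (the one for its smallest item), the partial results never overlap and can be merged without conflict resolution.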