Efficient mining of frequent itemsets in social network data based on MapReduce framework

Authors:
Zahra Farzanyar;Nick Cercone
Affiliations:
York University, Toronto, Canada;York University, Toronto, Canada
Venue:
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
Year:
2013



Abstract

Social networks promote information sharing between people everywhere and at all times. Mining the data produced in this data-rich environment can be extremely useful. Frequent itemset mining plays an important role in mining associations, correlations, sequential patterns, causality, episodes, multidimensional patterns, max-patterns, partial periodicity, emerging patterns, and many other significant data mining tasks in social networks. As social network data grows exponentially toward a terabyte or more, most traditional frequent itemset mining algorithms become ineffective because of either huge resource requirements or large communication overhead. Cloud computing has shown that very large datasets can be processed on commodity clusters, given the right programming model. MapReduce, a parallel programming model and one of the most important techniques for cloud computing, has emerged as a way to mine datasets of terabyte scale or larger on clusters of computers. In this paper, we propose an efficient frequent itemset mining algorithm, called IMRApriori, based on the MapReduce framework and running on Hadoop, a parallel storage and computing platform. The paper presents experimental results that corroborate the theoretical claims.
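
To make the general idea concrete, the following is a minimal single-process sketch of one MapReduce-style support-counting pass for itemsets of a fixed size k. It is not the IMRApriori algorithm, whose details the abstract does not give, and it does not use Hadoop. The function names (map_phase, reduce_phase, frequent_itemsets), the toy transactions, and the brute-force enumeration of all k-item subsets without Apriori candidate pruning are assumptions made purely for illustration.

```python
from collections import Counter
from itertools import combinations


def map_phase(partition, k):
    """Hypothetical mapper: emit (itemset, 1) for every k-item subset
    of each transaction in one data partition."""
    for transaction in partition:
        # Sort and deduplicate so the same itemset always yields the same key.
        for itemset in combinations(sorted(set(transaction)), k):
            yield itemset, 1


def reduce_phase(pairs):
    """Hypothetical reducer: sum the emitted counts per itemset."""
    counts = Counter()
    for itemset, count in pairs:
        counts[itemset] += count
    return counts


def frequent_itemsets(partitions, k, min_support):
    """Simulate one map/reduce pass and keep itemsets whose absolute
    support count is at least min_support."""
    pairs = (pair for part in partitions for pair in map_phase(part, k))
    counts = reduce_phase(pairs)
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Toy data (illustrative only): two partitions standing in for input splits.
partitions = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "c"], ["b", "d"], ["a", "b", "c"]],
]
result = frequent_itemsets(partitions, k=2, min_support=3)
print(result)
```

In a real deployment the mapper and reducer would run on separate cluster nodes, and an Apriori-style algorithm would normally generate candidates of size k only from the frequent itemsets of size k-1 rather than enumerating every subset as this sketch does.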