Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Beyond market baskets: generalizing association rules to correlations
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Scalable parallel data mining for association rules
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Asynchronous parallel algorithm for mining association rules on a shared-memory multi-processors
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
Efficient mining of emerging patterns: discovering trends and differences
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining frequent patterns without candidate generation
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Data mining: concepts and techniques
Data mining: concepts and techniques
Communication-efficient distributed mining of association rules
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
A fast distributed algorithm for mining association rules
DIS '96 Proceedings of the fourth international conference on on Parallel and distributed information systems
Parallel data mining for association rules on shared memory systems
Knowledge and Information Systems
Shared State for Distributed Interactive Data Mining Applications
Distributed and Parallel Databases - Special issue: Parallel and distributed data mining
Discovery of Frequent Episodes in Event Sequences
Data Mining and Knowledge Discovery
Parallel Algorithms for Discovery of Association Rules
Data Mining and Knowledge Discovery
Parallel and Distributed Association Mining: A Survey
IEEE Concurrency
Parallel Mining of Association Rules
IEEE Transactions on Knowledge and Data Engineering
ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
Scalable Techniques for Mining Causal Structures
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
An Efficient Algorithm for Mining Association Rules in Large Databases
VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Sampling Large Databases for Association Rules
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
Mining Frequent Itemsets in Distributed and Dynamic Databases
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
A High-Performance Distributed Algorithm for Mining Association Rules
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Frequent Pattern Mining on Message Passing Multiprocessor Systems
Distributed and Parallel Databases
A sampling-based framework for parallel data mining
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
Cache-conscious frequent pattern mining on a modern processor
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Out-of-core frequent pattern mining on a commodity PC
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Cut-and-stitch: efficient parallel learning of linear dynamical systems on smps
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Pfp: parallel fp-growth for query recommendation
Proceedings of the 2008 ACM conference on Recommender systems
Frequent itemset mining on graphics processors
Proceedings of the Fifth International Workshop on Data Management on New Hardware
A distributed placement service for graph-structured and tree-structured data
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Memory-efficient frequent-itemset mining
Proceedings of the 14th International Conference on Extending Database Technology
CLAP: Collaborative pattern mining for distributed information systems
Decision Support Systems
Proceedings of the WICSA/ECSA 2012 Companion Volume
PARMA: a parallel randomized algorithm for approximate association rules mining in MapReduce
Proceedings of the 21st ACM international conference on Information and knowledge management
Parallel frequent itemset mining using systolic arrays
Knowledge-Based Systems
Parallel approaches to machine learning-A comprehensive survey
Journal of Parallel and Distributed Computing
Mind the gap: large-scale frequent sequence mining
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Efficient mining of frequent itemsets in social network data based on MapReduce framework
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
Hi-index | 0.00 |
We present a strategy for mining frequent item sets from terabyte-scale data sets on cluster systems. The algorithm embraces the holistic notion of architecture-conscious datamining, taking into account the capabilities of the processor, the memory hierarchy and the available network interconnects. Optimizations have been designed for lowering communication costs using compressed data structures and a succinct encoding. Optimizations for improving cache, memory and I/O utilization using pruningand tiling techniques, and smart data placement strategies are also employed. We leverage the extended memory spaceand computational resources of a distributed message-passing clusterto design a scalable solution, where each node can extend its metastructures beyond main memory by leveraging 64-bit architecture support. Our solution strategy is presented in the context of FPGrowth, a well-studied and rather efficient frequent pattern mining algorithm. Results demonstrate that the proposed strategy result in near-linearscaleup on up to 48 nodes.