Frequent pattern mining is a fundamental data mining task with practical applications ranging from market basket data analysis to web link analysis. In this work, we show that state-of-the-art frequent pattern mining applications run inefficiently on shared-memory multiprocessor (SMP) systems, primarily because they make poor use of the memory hierarchy. To improve their efficiency, we explore memory performance improvements, task partitioning strategies, and task queuing models designed to maximize the scalability of pattern mining on SMP systems. Empirically, we show that the proposed strategies significantly improve performance. We also discuss the implications of this work in light of recent trends in microarchitecture design, particularly chip multiprocessors (CMPs).