Algorithms for clustering data
Algorithms for clustering data
Cilk: an efficient multithreaded runtime system
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Fast sequential and parallel algorithms for association rule mining: a comparison
Fast sequential and parallel algorithms for association rule mining: a comparison
An effective hash-based algorithm for mining association rules
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Dynamic itemset counting and implication rules for market basket data
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
Data mining: concepts and techniques
Data mining: concepts and techniques
Parallel data mining for association rules on shared-memory multi-processors
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
PARSIMONY: An infrastructure for parallel multidimensional analysis and data mining
Journal of Parallel and Distributed Computing - Special issue on high-performance data mining
Parallel data mining for association rules on shared memory systems
Knowledge and Information Systems
Parallel and Distributed Association Mining: A Survey
IEEE Concurrency
Strategies for Parallel Data Mining
IEEE Concurrency
OpenMP: An Industry-Standard API for Shared-Memory Programming
IEEE Computational Science & Engineering
Parallel Mining of Association Rules
IEEE Transactions on Knowledge and Data Engineering
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
An Efficient Algorithm for Mining Association Rules in Large Databases
VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Mining of Association Rules in Very Large Databases: A Structured Parallel Approach
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Parallel Classification for Data Mining on Shared-Memory Multiprocessors
ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Hi-index | 0.00 |
This paper presents a case study in developing an application class specific high-level interface for shared memory parallel programming. The application class we focus on is data mining. With the availability of large datasets in areas like bioinformatics, medical informatics, scientific data analysis, financial analysis, telecommunications, retailing, and marketing, data mining tasks have become an important application class for high performance computing.Our study of a number of common data mining algorithms has shown that the same set of parallelization techniques can be applied to all of them. To exploit this, we have developed a reduction-object based interface to rapidly specify a shared memory parallel data mining algorithm. The set of parallelization techniques we target include full replication, optimized full locking, and cachesensitive locking. We show how our runtime system can apply any of these technique starting from a common specification.We have evaluated our high-level interface and the parallelization techniques using apriori association mining and k-means clustering algorithms. Our experimental results show that the overhead of the interface is within 10% and our parallelization techniques scale well.