In this article, we describe general approaches and specific techniques for scaling data mining algorithms to large data sets through parallel processing. We then analyze in more detail three core data mining tasks whose algorithms can be scaled to large data sets: building decision trees, discovering association rules, and creating clusters.