Discovery of association rules is an important data mining task. Several parallel and sequential algorithms have been proposed in the literature to solve this problem. Almost all of these algorithms make repeated passes over the database to determine the set of frequent itemsets (subsets of the database items), thus incurring high I/O overhead. In the parallel case, most algorithms perform a sum-reduction at the end of each pass to construct the global counts, also incurring high synchronization cost. In this paper we describe new parallel association mining algorithms. The algorithms use novel itemset clustering techniques to approximate the set of potentially maximal frequent itemsets. Once this set has been identified, the algorithms make use of efficient traversal techniques to generate the frequent itemsets contained in each cluster. We propose two clustering schemes based on equivalence classes and maximal hypergraph cliques, and study two lattice traversal techniques based on bottom-up and hybrid search. We use a vertical database layout to cluster related transactions together. The database is also selectively replicated so that the portion of the database needed for the computation of associations is local to each processor. After the initial set-up phase, the algorithms do not need any further communication or synchronization. The algorithms minimize I/O overheads by scanning the local database portion only twice: once in the set-up phase, and once when processing the itemset clusters. Unlike previous parallel approaches, the algorithms use simple intersection operations to compute frequent itemsets and do not have to maintain or search complex hash structures. Our experimental testbed is a 32-processor DEC Alpha cluster interconnected by the Memory Channel network. We present results on the performance of our algorithms on various databases, and compare them against a well-known parallel algorithm. The best new algorithm outperforms it by an order of magnitude.
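The combination of a vertical layout, prefix-based equivalence classes, and bottom-up search by intersection can be illustrated with a small serial sketch. It is not the paper's parallel implementation: it omits clique-based clustering, hybrid search, database replication, and any distribution across processors. The function names (`to_vertical`, `mine_class`, `frequent_itemsets`) and the use of an absolute `min_support` count are assumptions made for illustration only.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Build the vertical layout: item -> set of ids of transactions containing it."""
    tidlists = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists[item].add(tid)
    return dict(tidlists)


def mine_class(members, min_support, frequent):
    """Bottom-up search inside one equivalence class.

    `members` is a list of (itemset, tidset) pairs that share a common prefix.
    Joining two members only requires intersecting their tid-sets, so no hash
    structure over candidates is needed.
    """
    for i, (itemset_a, tids_a) in enumerate(members):
        next_class = []
        for itemset_b, tids_b in members[i + 1:]:
            tids = tids_a & tids_b  # support of the union = size of the intersection
            if len(tids) >= min_support:
                candidate = itemset_a + (itemset_b[-1],)
                frequent[candidate] = len(tids)
                next_class.append((candidate, tids))
        if next_class:
            # All candidates built from itemset_a form the next, longer-prefix class.
            mine_class(next_class, min_support, frequent)


def frequent_itemsets(transactions, min_support):
    """Return {itemset tuple: support count} for all frequent itemsets."""
    vertical = to_vertical(transactions)
    frequent = {}
    singles = []
    for item in sorted(vertical):
        tids = vertical[item]
        if len(tids) >= min_support:
            frequent[(item,)] = len(tids)
            singles.append(((item,), tids))
    mine_class(singles, min_support, frequent)
    return frequent


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in sorted(frequent_itemsets(db, 3).items()):
        print(itemset, support)
```

Because each class is mined using only the tid-sets of its own members, classes are independent of one another once the tid-sets are available locally, which is the property that lets the described algorithms run without further communication after the set-up phase.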