Existing parallel algorithms for association rule mining either incur a large inter-site communication cost or require a large amount of space to maintain the local support counts of many candidate sets. This study proposes a de-clustering approach for distributed architectures that eliminates the inter-site communication cost for most of the influential association rule mining algorithms. To de-cluster the database into similar partitions, an efficient algorithm is developed that approximates the shortest spanning path (SSP) linking the transactions together. The resulting SSP is then used to de-cluster the transaction data evenly into subgroups. The proposed approach guarantees that all subgroups are similar to each other and to the original group. Experimental results show that data size and the number of items are the only two factors that determine the performance of de-clustering. Moreover, with this approach most of the influential association rule mining algorithms can be implemented in a distributed architecture and obtain a drastic increase in speed without losing any frequent itemsets. Furthermore, the data distribution at each de-clustered participant is almost the same as that of a single site, which implies that the proposed approach can be regarded as a sampling method for distributed association rule mining. Finally, the experimental results show that mining results that were originally inadequate can be improved to an almost perfect level.
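
The two steps described above (link the transactions along an approximate shortest spanning path, then deal them out evenly along that path) can be illustrated with a minimal Python sketch. This is not the paper's algorithm, which the abstract does not specify. It assumes a greedy nearest-neighbor heuristic for the path, Jaccard distance between itemsets as the dissimilarity measure, and round-robin assignment along the path. The names `jaccard_distance`, `greedy_spanning_path`, and `decluster` are hypothetical.

```python
from typing import FrozenSet, List, Sequence

Transaction = FrozenSet[str]


def jaccard_distance(a: Transaction, b: Transaction) -> float:
    """Dissimilarity of two itemsets: 0.0 if identical, 1.0 if disjoint.

    Assumption: the abstract does not name a distance measure; Jaccard
    distance is used here only for illustration.
    """
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def greedy_spanning_path(transactions: Sequence[Transaction]) -> List[int]:
    """Approximate a shortest spanning path with a nearest-neighbor heuristic.

    Starts at transaction 0 and repeatedly moves to the closest unvisited
    transaction (ties broken by lowest index). Runs in O(n^2) distance
    evaluations. This heuristic is an assumption, not the paper's method.
    """
    if not transactions:
        return []
    remaining = set(range(1, len(transactions)))
    path = [0]
    while remaining:
        last = transactions[path[-1]]
        nxt = min(
            remaining,
            key=lambda i: (jaccard_distance(last, transactions[i]), i),
        )
        path.append(nxt)
        remaining.remove(nxt)
    return path


def decluster(transactions: Sequence[Transaction], k: int) -> List[List[Transaction]]:
    """Deal transactions out to k partitions in round-robin order along the path.

    Neighbors on the path are similar, so sending consecutive transactions
    to different partitions spreads each kind of transaction across sites.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    partitions: List[List[Transaction]] = [[] for _ in range(k)]
    for position, index in enumerate(greedy_spanning_path(transactions)):
        partitions[position % k].append(transactions[index])
    return partitions


if __name__ == "__main__":
    data = [
        frozenset({"a", "b"}),
        frozenset({"c", "d"}),
        frozenset({"a", "b"}),
        frozenset({"c", "d"}),
    ]
    for site, part in enumerate(decluster(data, 2)):
        print(site, [sorted(t) for t in part])
```

In the toy data, the two `{a, b}` transactions are adjacent on the path, as are the two `{c, d}` transactions, so each of the two partitions receives one of each. That is the sense in which the partitions resemble one another and the original database. Under the sketch's assumptions, each site could then run an unmodified mining algorithm on its own partition.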