Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
An effective hash-based algorithm for mining association rules
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
A fast distributed algorithm for mining association rules
DIS '96 Proceedings of the fourth international conference on Parallel and distributed information systems
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
This paper proposes a new, efficient formulation for generating the frequent itemsets used to compute association rules. It addresses shortcomings of the serial Apriori algorithm and of its parallel formulation based on count distribution (CD). It improves on the time taken by each pass of the CD algorithm and thereby helps improve scalability. The scalability and efficiency of serial Apriori and of the CD algorithm are analyzed with respect to the proposed formulation. As the database grows, the efficiency of the CD algorithm decreases because of the large number of scans through the entire database. The proposed approach addresses this problem and suggests improvements that yield better scalability and efficiency. Experiments show that the proposed formulation scales linearly with the number of transactions. The algorithm also has excellent scale-up properties with respect to the transaction size and the number of items in the dataset.
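To make the scan cost concrete, the following is a minimal single-process sketch of the baseline count distribution scheme, not of the proposed formulation, whose details the abstract does not give. It assumes each node holds one partition of the transactions, that every node counts the same candidate set locally, and that the local counts are then summed. The names `candidates`, `local_counts`, and `count_distribution` are hypothetical, the candidate generation is deliberately naive, and the "nodes" are simulated with a plain loop.

```python
from collections import Counter
from itertools import combinations


def candidates(prev_frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (naive Apriori pruning)."""
    items = sorted({i for s in prev_frequent for i in s})
    out = set()
    for combo in combinations(items, k):
        if all(frozenset(sub) in prev_frequent for sub in combinations(combo, k - 1)):
            out.add(frozenset(combo))
    return out


def local_counts(partition, cands):
    """One full scan of a node's local partition, counting every candidate."""
    counts = Counter()
    for transaction in partition:
        for c in cands:
            if c <= transaction:  # candidate is contained in the transaction
                counts[c] += 1
    return counts


def count_distribution(partitions, min_support):
    """Simulate CD: identical candidates on every node, local counts summed per pass."""
    # Assumption: the item universe is collected up front to seed pass 1.
    cands = {frozenset([i]) for p in partitions for t in p for i in t}
    frequent = {}
    passes = 0
    k = 1
    while cands:
        passes += 1  # each pass rescans every partition in full
        total = Counter()
        for p in partitions:  # would run in parallel on separate nodes
            total.update(local_counts(p, cands))  # stands in for the global sum
        level = {c: n for c, n in total.items() if n >= min_support}
        frequent.update(level)
        k += 1
        cands = candidates(set(level), k)
    return frequent, passes
```

Calling `count_distribution(partitions, min_support)` on a list of partitions, each a list of item sets, returns the frequent itemsets with their global counts together with the number of passes. Every pass rescans all partitions, which is the per-pass cost the abstract identifies as the bottleneck when the database grows.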