Existing association rule mining algorithms suffer from several problems when mining massive transactional datasets. One major problem is their high memory dependency: either the very large data structure they build is assumed to fit in main memory, or the recursive mining process consumes too much memory. Another major impediment is the repetitive and interactive nature of any knowledge discovery process: tuning parameters requires many runs of the same algorithm, so these huge data structures are rebuilt time and again. This paper proposes a new disk-based association rule mining algorithm called Inverted Matrix, which achieves its efficiency by applying three new ideas. First, transactional data is converted into a new database layout, also called Inverted Matrix, that avoids repeated scans of the database during the mining phase; frequent patterns can be found in less than one full scan by using random access. Second, for each frequent item, a relatively small independent tree is built that summarizes its co-occurrences. Finally, a simple, non-recursive mining process reduces memory requirements because only minimal candidate generation and counting are needed. Experimental studies reveal that our Inverted Matrix approach outperforms FP-Tree, especially when mining very large transactional databases with a very large number of unique items. Our random-access, disk-based approach is particularly advantageous in a repetitive and interactive setting.
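To make the "less than a full scan" idea concrete, here is a minimal sketch under stated assumptions. It is not the paper's Inverted Matrix layout or its co-occurrence trees: it uses a simplified inverted index (item to transaction ids) held in memory, with rows sorted by ascending support, so that a run at a given threshold can skip every row below it. The function names `build_inverted_layout` and `frequent_pairs`, and the restriction to item pairs, are hypothetical choices made for illustration only.

```python
from bisect import bisect_left
from collections import defaultdict
from itertools import combinations


def build_inverted_layout(transactions):
    """Build a simplified inverted layout (an assumption, not the paper's format).

    Returns a list of (item, set_of_transaction_ids) rows sorted by
    ascending support, with ties broken by item name.
    """
    rows = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in set(items):
            rows[item].add(tid)
    return sorted(rows.items(), key=lambda row: (len(row[1]), row[0]))


def frequent_pairs(layout, min_support):
    """Count frequent item pairs, reading only rows that meet min_support."""
    supports = [len(tids) for _, tids in layout]
    # Rows are sorted by support, so a binary search finds the first
    # frequent row; everything before it is never touched.
    start = bisect_left(supports, min_support)
    frequent_rows = layout[start:]

    result = {}
    # Non-recursive: a flat loop over pairs of frequent rows.
    for (a, tids_a), (b, tids_b) in combinations(frequent_rows, 2):
        count = len(tids_a & tids_b)
        if count >= min_support:
            result[(a, b)] = count
    return result


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c", "d"]]
    layout = build_inverted_layout(data)
    print(frequent_pairs(layout, min_support=3))
```

Because the layout is built once and only the starting row changes with the threshold, rerunning `frequent_pairs` with a different `min_support` does not require rebuilding anything, which is the property the abstract highlights for repetitive, interactive tuning.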