Ambiguous frequent itemset mining and polynomial delay enumeration
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
Mining frequently appearing patterns in a database is a basic problem in informatics, especially in data mining. In particular, when the input database is a collection of subsets of an itemset, the problem is called the frequent itemset mining problem, and it has been studied extensively. In real-world use, one difficulty of frequent itemset mining is that the data is often incorrect or has missing parts. As a result, some records that should include a pattern do not contain it. To deal with such data, one can use an ambiguous inclusion relation and find patterns that are mostly included in many records. However, computational difficulty has prevented this approach from being widely used in practice. In this paper, we use an alternative inclusion relation in which an itemset P is considered to be included in an itemset T if at most k items of P are not included in T, i.e., |P \ T| ≤ k. We address the problem of enumerating frequent itemsets under this inclusion relation and propose an efficient polynomial-delay, polynomial-space algorithm. Moreover, to allow many small frequent itemsets of little value to be skipped, we propose an algorithm for directly enumerating frequent itemsets of a given size.
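The inclusion relation |P \ T| ≤ k can be made concrete with a small Python sketch. This is only a brute-force illustration of the definition, not the polynomial-delay algorithm proposed in the paper. The function names, the `min_support` threshold parameter, and the toy database are assumptions introduced here for illustration.

```python
from itertools import combinations


def ambiguously_included(pattern, record, k):
    """True if at most k items of pattern are missing from record, i.e. |P \\ T| <= k."""
    return len(set(pattern) - set(record)) <= k


def ambiguous_frequency(pattern, database, k):
    """Number of records that include pattern under the ambiguous relation."""
    return sum(1 for record in database if ambiguously_included(pattern, record, k))


def frequent_itemsets_of_size(database, size, k, min_support):
    """Naive enumeration (hypothetical helper): test every itemset of the given size.

    This checks all combinations of items, so it is exponential in general
    and serves only to illustrate the problem statement.
    """
    items = sorted(set().union(*database))
    return [
        set(candidate)
        for candidate in combinations(items, size)
        if ambiguous_frequency(candidate, database, k) >= min_support
    ]


# Toy database (made up for this example): each record is a set of items.
database = [{1, 2, 3}, {1, 2}, {2, 3}, {1, 3, 4}]

exact = ambiguous_frequency({1, 2, 3}, database, k=0)     # ordinary inclusion
tolerant = ambiguous_frequency({1, 2, 3}, database, k=1)  # one missing item allowed
size_three = frequent_itemsets_of_size(database, size=3, k=1, min_support=4)
```

With k = 0 the relation reduces to ordinary set inclusion, so the pattern {1, 2, 3} is found in one record only; with k = 1 it is counted in all four. Note also that any itemset with at most k items is included in every record under this relation, which is one reason enumerating only itemsets of a chosen size can be useful.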