Pattern mining techniques generally enumerate many uninteresting and redundant patterns. Condensed representations exist that yield less redundant collections, but they often rely on complete enumeration of the pattern space, which can be prohibitive in time and memory. Sampling can filter the output space of patterns without explicit enumeration. We propose a framework for random sampling of maximal itemsets from transactional databases. The framework can use any monotonically decreasing measure as its interestingness criterion. Moreover, we use an approximation measure to guide the search for maximal sets toward different parts of the output space. Our experiments show that the method rapidly generates small collections of good-quality patterns. The sampling framework has been implemented in the interactive visual data mining tool MIME, enabling users to quickly sample a collection of patterns and analyze the results.
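To make the idea of sampling maximal itemsets without enumerating the pattern space concrete, here is a minimal sketch of one possible random walk. It is an illustration under stated assumptions, not the framework's actual procedure: the measure is plain support (which can only decrease as items are added), the threshold `min_support` and the function names are hypothetical, and the next item is chosen uniformly at random, standing in for the guided choice the abstract mentions but does not specify.

```python
import random


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def sample_maximal_itemset(transactions, min_support, rng=None):
    """Grow an itemset by random single-item extensions until none is frequent.

    Assumption: support is the interestingness measure. Because support never
    increases when an item is added, an itemset with no frequent single-item
    extension has no frequent superset at all, so the walk ends at a maximal
    frequent itemset.
    """
    rng = rng or random.Random()
    items = sorted(set().union(*transactions))
    current = frozenset()
    while True:
        # Items whose addition keeps the itemset at or above the threshold.
        extensions = [
            i for i in items
            if i not in current
            and support(current | {i}, transactions) >= min_support
        ]
        if not extensions:
            return current
        # Uniform choice is a placeholder for a guided selection rule.
        current = current | {rng.choice(extensions)}


if __name__ == "__main__":
    # Hypothetical toy database of five transactions.
    db = [frozenset(t) for t in ("ab", "ab", "cd", "cd", "ac")]
    print(sorted(sample_maximal_itemset(db, min_support=2)))
```

Each call costs a number of support counts proportional to the number of items times the length of the returned itemset, so a small sample can be drawn without materializing the full collection of frequent patterns. Note that this simple walk does not sample maximal itemsets uniformly: sets reachable through more extension orders are returned more often.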