MINI: Mining Informative Non-redundant Itemsets

Authors:
Arianna Gallo; Tijl De Bie; Nello Cristianini
Affiliations:
University of Bristol, Department of Engineering Mathematics, UK;University of Bristol, Department of Engineering Mathematics, UK;University of Bristol, Department of Engineering Mathematics, UK and University of Bristol, Department of Computer Science, UK
Venue:
PKDD 2007 Proceedings of the 11th European conference on Principles and Practice of Knowledge Discovery in Databases
Year:
2007

Citing 9
Cited 12

Mining association rules between sets of items in large databases

SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Efficient mining of association rules using closed itemset lattices

Information Systems
Efficient discovery of error-tolerant frequent itemsets in high dimensions

Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Free-Sets: A Condensed Representation of Boolean Data for the Approximation of Frequency Queries

Data Mining and Knowledge Discovery
Mining All Non-derivable Frequent Itemsets

PKDD '02 Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery
Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding Window

ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
Summarizing itemset patterns: a profile-based approach

Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Interestingness measures for data mining: A survey

ACM Computing Surveys (CSUR)
Closed non-derivable itemsets

PKDD'06 Proceedings of the 10th European conference on Principle and Practice of Knowledge Discovery in Databases

The Model of Most Informative Patterns and Its Application to Knowledge Extraction from Graph Databases

ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II
A framework for mining interesting pattern sets

Proceedings of the ACM SIGKDD Workshop on Useful Patterns
A framework for mining interesting pattern sets

ACM SIGKDD Explorations Newsletter
An information theoretic framework for data mining

Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Tell me what I need to know: succinctly summarizing data with itemsets

Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Maximum entropy models and subjective interestingness: an application to tiles in binary databases

Data Mining and Knowledge Discovery
Summarizing data succinctly with the most informative itemsets

ACM Transactions on Knowledge Discovery from Data (TKDD) - Special Issue on the Best of SIGKDD 2011
Closed and noise-tolerant patterns in n-ary relations

Data Mining and Knowledge Discovery
Formal and computational properties of the confidence boost of association rules

ACM Transactions on Knowledge Discovery from Data (TKDD)
A statistical significance testing approach to mining the most informative set of patterns

Data Mining and Knowledge Discovery
20 years of pattern mining: a bibliometric survey

ACM SIGKDD Explorations Newsletter
Behavior-based clustering and analysis of interestingness measures for association rule mining

Data Mining and Knowledge Discovery

Abstract

Frequent itemset mining assists the data mining practitioner in searching for strongly associated items (and transactions) in large transaction databases. Since the number of frequent itemsets is usually extremely large and unmanageable for a human user, recent works have sought to define condensed representations of them, e.g. closed or maximal frequent itemsets. We argue that these methods not only often fall short of sufficiently reducing the output size, but also output many redundant itemsets. In this paper we propose a philosophically new approach that resolves both of these issues in a computationally tractable way. We present and empirically validate a statistically founded approach called MINI, which compresses the set of frequent itemsets down to a list of informative and non-redundant itemsets.
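
To make the condensed representations named in the abstract concrete, the following minimal Python sketch enumerates frequent, closed, and maximal frequent itemsets on a toy database. It uses the standard textbook definitions and does not implement MINI, whose procedure is not described in this record. The function names, the toy transactions, and the support threshold are all assumptions made for illustration, and the brute-force enumeration is only suitable for very small inputs.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration: map each frequent itemset to its support count."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            # Support = number of transactions containing the whole itemset.
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
    return result


def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with the same support."""
    return {
        s: c for s, c in frequent.items()
        if not any(s < other and c == frequent[other] for other in frequent)
    }


def maximal_itemsets(frequent):
    """Keep itemsets that have no frequent proper superset at all."""
    return {
        s: c for s, c in frequent.items()
        if not any(s < other for other in frequent)
    }


# Hypothetical toy database (not taken from the paper).
transactions = [frozenset(t) for t in (
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "d"},
)]

frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
maximal = maximal_itemsets(frequent)

print(len(frequent), len(closed), len(maximal))
```

On this toy input the closed and maximal sets are smaller than the full set of frequent itemsets, but several overlapping itemsets still remain, which is the kind of redundancy the abstract says motivates MINI.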