Decomposable Families of Itemsets

Authors:
Nikolaj Tatti;Hannes Heikinheimo
Affiliations:
HIIT Basic Research Unit, Department of Information and Computer Science, Helsinki University of Technology, Finland;HIIT Basic Research Unit, Department of Information and Computer Science, Helsinki University of Technology, Finland
Venue:
ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
Year:
2008

Abstract

The problem of selecting a small yet high-quality subset of patterns from a larger collection of itemsets has recently attracted considerable research attention. Here we discuss an approach to this problem using the notion of decomposable families of itemsets. Such itemset families define a probabilistic model for the data from which the original collection of itemsets was derived. Furthermore, they induce a special tree structure, called a junction tree, familiar from the theory of Markov random fields. The method has several advantages. The junction trees provide an intuitive representation of the mining results. From a computational point of view, the model provides leverage for problems that could be intractable using the entire collection of itemsets. We provide an efficient algorithm to build decomposable itemset families, and give an application example with frequency bound querying using the model. An empirical study shows that our algorithm yields high-quality results.
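
As a rough illustration of how a junction tree turns a few itemset frequencies into a probabilistic model, the sketch below estimates the frequency of {a, b, c} from the two cliques {a, b} and {b, c} joined by the separator {b}, using the standard decomposable-model factorization (product of clique marginals divided by the separator marginal). It is not the authors' algorithm: the toy transactions, the item names, and the helper functions `frequency` and `chain_estimate` are assumptions made up for this example.

```python
from fractions import Fraction

# Toy transactions over items a, b, c (hypothetical data, for illustration only).
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"b", "c"},
    {"b"},
    {"a"},
    {"c"},
    set(),
    {"a", "b", "c"},
]


def frequency(itemset, data):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return Fraction(sum(itemset <= t for t in data), len(data))


def chain_estimate(data):
    """Estimate freq({a, b, c}) from the junction tree {a, b} - [b] - {b, c}.

    Under the decomposable model, a and c are conditionally independent
    given b, so freq(abc) is approximated by freq(ab) * freq(bc) / freq(b).
    """
    f_ab = frequency({"a", "b"}, data)
    f_bc = frequency({"b", "c"}, data)
    f_b = frequency({"b"}, data)
    return f_ab * f_bc / f_b


estimate = chain_estimate(transactions)
actual = frequency({"a", "b", "c"}, transactions)
print("model estimate:", estimate)
print("empirical frequency:", actual)
```

The estimate uses only three stored frequencies rather than the whole itemset collection. The gap between the estimate and the empirical frequency reflects how far the data depart from the conditional independence that this particular tree assumes.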