Transaction databases, frequent itemsets, and their condensed representations

Authors:
Taneli Mielikäinen
Affiliations:
HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, Finland
Venue:
KDID'05 Proceedings of the 4th international conference on Knowledge Discovery in Inductive Databases
Year:
2005

Abstract

Mining frequent itemsets is a fundamental task in data mining. Unfortunately, the number of frequent itemsets describing the data is often too large to comprehend. This problem has been attacked with condensed representations of frequent itemsets: subcollections that contain only those frequent itemsets that cannot be deduced from the other itemsets in the subcollection by a given set of deduction rules. In this paper we review the most popular condensed representations of frequent itemsets, study their relationships to transaction databases and to each other, examine their combinatorial and computational complexity, and describe their relationship to other important concepts in combinatorial data analysis, such as the Vapnik-Chervonenkis dimension and hypergraph transversals.
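
The idea of a condensed representation can be illustrated with closed itemsets, one widely used representation of this kind. The following is a minimal sketch and is not taken from the paper: the toy transaction database, the threshold, and the function names (support, frequent_itemsets, closed_itemsets, deduce_support) are all assumptions made for illustration, and the brute-force enumeration is only suitable for tiny examples. It keeps the frequent itemsets that have no proper superset with the same support, and then recovers the support of every frequent itemset as the maximum support among its closed supersets.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of all non-empty frequent itemsets.

    Returns a dict mapping each frequent itemset (a frozenset) to its support.
    Exponential in the number of items; for illustration only.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            s = support(candidate, transactions)
            if s >= min_support:
                result[candidate] = s
    return result


def closed_itemsets(frequent):
    """Keep the frequent itemsets with no proper superset of equal support."""
    return {
        x: s
        for x, s in frequent.items()
        if not any(x < y and s == sy for y, sy in frequent.items())
    }


def deduce_support(itemset, closed):
    """Deduction rule for closed itemsets: the support of a frequent itemset
    is the largest support among the closed itemsets that contain it.
    Returns None if no closed itemset contains `itemset` (not frequent)."""
    candidates = [s for y, s in closed.items() if itemset <= y]
    return max(candidates) if candidates else None


# Hypothetical toy database: four transactions over the items a, b, c.
transactions = [
    frozenset(t) for t in ({"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"c"})
]

frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)

for itemset, s in sorted(closed.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), s)
```

On this toy database all seven non-empty itemsets over {a, b, c} are frequent at threshold 2, while only three of them ({a, b}, {a, b, c}, and {c}) are closed; deduce_support reproduces the supports of the remaining four from those three.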