Non-derivable itemset mining

Authors:
Toon Calders;Bart Goethals
Affiliations:
Eindhoven Technical University, Eindhoven, The Netherlands and University of Antwerp, Antwerp, Belgium;University of Antwerp, Antwerp, Belgium
Venue:
Data Mining and Knowledge Discovery
Year:
2007

Abstract

All frequent itemset mining algorithms rely heavily on the monotonicity principle for pruning: every superset of an infrequent itemset is itself infrequent, so such candidates can be excluded from the expensive counting phase. In this paper, we present sound and complete deduction rules that derive bounds on the support of an itemset from the supports of its subsets. Based on these rules, we construct a condensed representation of all frequent itemsets by removing those itemsets whose support can be derived, resulting in the so-called Non-Derivable Itemsets (NDI) representation. We also relate our proposal to other recent proposals for condensed representations of frequent itemsets. Experiments on real-life datasets show the effectiveness of the NDI representation, making the search for frequent non-derivable itemsets a useful and tractable alternative to mining all frequent itemsets.
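
To make the idea of deriving support bounds concrete, here is a minimal Python sketch. It assumes the inclusion-exclusion form of the deduction rules: for an itemset I and each X ⊆ I, the alternating sum of supp(J) over all J with X ⊆ J ⊂ I is an upper bound on supp(I) when |I \ X| is odd and a lower bound when it is even. The function names (`support`, `support_bounds`, `is_derivable`) and the toy database are hypothetical and chosen for illustration only. Supports are recounted naively for brevity, so this is not the authors' implementation.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def support_bounds(itemset, transactions):
    """Return (lower, upper) bounds on support(itemset), computed only from
    the supports of its proper subsets (inclusion-exclusion rules)."""
    I = frozenset(itemset)
    lower, upper = 0, len(transactions)
    members = sorted(I)
    for k in range(len(members) + 1):
        for X in map(frozenset, combinations(members, k)):
            rest = sorted(I - X)
            delta = 0
            # Sum over all J with X <= J < I; r < len(rest) keeps J != I.
            for r in range(len(rest)):
                for extra in combinations(rest, r):
                    J = X | frozenset(extra)
                    sign = (-1) ** (len(I) - len(J) + 1)
                    delta += sign * support(J, transactions)
            if len(rest) % 2 == 1:
                upper = min(upper, delta)  # |I \ X| odd: upper bound
            else:
                lower = max(lower, delta)  # |I \ X| even: lower bound
    return lower, upper


def is_derivable(itemset, transactions):
    """An itemset is derivable when its lower and upper bounds coincide."""
    lower, upper = support_bounds(itemset, transactions)
    return lower == upper


# Hypothetical toy database: item "a" occurs in every transaction.
transactions = [
    frozenset({"a", "b"}),
    frozenset({"a", "b"}),
    frozenset({"a"}),
    frozenset({"a", "c"}),
]

print(support_bounds({"a", "b"}, transactions))
print(is_derivable({"a", "b"}, transactions))
print(support_bounds({"b", "c"}, transactions))
```

In this toy database the bounds for {a, b} coincide, so its support need not be counted or stored, while {b, c} has distinct bounds and would stay in the representation if it were frequent. A practical miner would cache subset supports level by level rather than rescanning the transactions for each term.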