Mining All Non-derivable Frequent Itemsets

Authors:
Toon Calders; Bart Goethals
Venue:
PKDD '02 Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery
Year:
2002

Abstract

Recent studies on frequent itemset mining algorithms have resulted in significant performance improvements. However, if the minimum support threshold is set too low, or the data is highly correlated, the number of frequent itemsets can itself be prohibitively large. To overcome this problem, several recent proposals construct a concise representation of the frequent itemsets instead of mining all of them. The main goal of this paper is to identify redundancies in the set of all frequent itemsets and to exploit them to reduce the size of the result of a mining operation. We present deduction rules that derive tight bounds on the support of candidate itemsets, and we show how these rules allow a minimal representation of all frequent itemsets to be constructed. We also relate our proposal to recent proposals for concise representations, and we report experiments on real-life datasets that show the effectiveness of the deduction rules. The experiments even show that, in many cases, first mining the concise representation and then generating the frequent itemsets from it outperforms existing frequent itemset mining algorithms.
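
The abstract does not spell out the deduction rules, so the following is only a minimal sketch of how such support bounds can be computed. It assumes the inclusion-exclusion form of the rules: for an itemset I and each proper subset X, the signed sum of supp(J) over all J with X ⊆ J ⊂ I, with sign (-1)^(|I\J|+1), is an upper bound on supp(I) when |I\X| is odd and a lower bound when |I\X| is even. The helper names (`subsets`, `support`, `support_bounds`) and the toy transactions are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` (including the empty set) as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def support(transactions, itemset):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support_bounds(itemset, supp):
    """Return (lower, upper) bounds on the support of a non-empty `itemset`.

    `supp` maps every proper subset of `itemset` (as a frozenset, including
    the empty set) to its support. Assumed rule: for each proper subset X,
    sigma = sum over X <= J < I of (-1)^(|I\\J|+1) * supp(J) bounds supp(I)
    from above if |I\\X| is odd and from below if |I\\X| is even.
    """
    itemset = frozenset(itemset)
    lower, upper = 0, float("inf")  # X = I itself gives the trivial bound 0
    for x in subsets(itemset):
        if x == itemset:
            continue
        rest = itemset - x
        sigma = 0
        for extra in subsets(rest):
            j = x | extra
            if j == itemset:
                continue  # only proper subsets J of I contribute
            sign = 1 if len(itemset - j) % 2 == 1 else -1
            sigma += sign * supp[j]
        if len(rest) % 2 == 1:
            upper = min(upper, sigma)
        else:
            lower = max(lower, sigma)
    return lower, upper


# Toy database (made up for illustration): four transactions over a, b, c.
transactions = [
    frozenset("abc"),
    frozenset("ab"),
    frozenset("ac"),
    frozenset("bc"),
]
supp = {s: support(transactions, s) for s in subsets("abc")}

bounds_ab = support_bounds("ab", supp)    # bounds differ: ab must be counted
bounds_abc = support_bounds("abc", supp)  # bounds coincide: abc is derivable
print(bounds_ab, bounds_abc)
```

In this toy database the bounds for {a, b} are 2 and 3, so its support cannot be deduced from its subsets, whereas the bounds for {a, b, c} both equal 1, which is its actual support. An itemset whose lower and upper bounds coincide carries no new information, which is the kind of redundancy the abstract proposes to exploit.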