On Maximal Frequent and Minimal Infrequent Sets in Binary Matrices

Authors:
E. Boros; V. Gurvich; L. Khachiyan; K. Makino
Affiliations:
RUTCOR, Rutgers University, 640 Bartholomew Road, Piscataway, New Jersey 08854-8003, USA e-mail: boros@rutcor.rutgers.edu
RUTCOR, Rutgers University, 640 Bartholomew Road, Piscataway, New Jersey 08854-8003, USA e-mail: gurvich@rutcor.rutgers.edu
Department of Computer Science, Rutgers University, 110 Frelinghuysen Road, Piscataway, New Jersey 08854-8019, USA e-mail: leonid@cs.rutgers.edu
Division of Systems Science, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka, 560-8531, Japan e-mail: makino@sys.es.osaka-u.ac.jp
Venue:
Annals of Mathematics and Artificial Intelligence
Year:
2003

Abstract

Given an m×n binary matrix A, a subset C of the columns is called t-frequent if there are at least t rows of A in which all entries belonging to C are non-zero. Let α denote the number of maximal t-frequent sets of A, and let β denote the number of minimal t-infrequent sets, that is, minimal column subsets of A which are not t-frequent. We prove that the inequality α ≤ (m−t+1)β holds for any binary matrix A in which not all column subsets are t-frequent. This inequality is sharp, and it allows for an incremental quasi-polynomial algorithm for generating all minimal t-infrequent sets. We also prove that the analogous generation problem for maximal t-frequent sets is NP-hard. Finally, we discuss the complexity of generating closed frequent sets and some other related problems.
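
To make the definitions in the abstract concrete, the following is a minimal brute-force sketch that enumerates the maximal t-frequent and minimal t-infrequent column sets of a small matrix and compares their counts against the stated bound. It is not the incremental quasi-polynomial algorithm of the paper: it inspects all 2^n column subsets and is only meant for tiny inputs. The function names `support` and `borders`, the matrix `A`, and the threshold `t = 2` are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def support(matrix, cols):
    """Count the rows whose entries are non-zero in every column of `cols`."""
    return sum(1 for row in matrix if all(row[c] != 0 for c in cols))


def borders(matrix, t):
    """Return (maximal t-frequent sets, minimal t-infrequent sets).

    Brute force over all column subsets; illustrative only.
    """
    n = len(matrix[0])
    subsets = [frozenset(c) for k in range(n + 1)
               for c in combinations(range(n), k)]
    frequent = {s for s in subsets if support(matrix, s) >= t}
    # Frequency is monotone: every subset of a t-frequent set is t-frequent.
    # Hence it suffices to test one-column extensions and one-column removals.
    maximal_frequent = [
        s for s in subsets
        if s in frequent
        and all(s | {c} not in frequent for c in range(n) if c not in s)
    ]
    minimal_infrequent = [
        s for s in subsets
        if s not in frequent and all(s - {c} in frequent for c in s)
    ]
    return maximal_frequent, minimal_infrequent


# Hypothetical 4x3 example matrix (m = 4 rows, n = 3 columns).
A = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
    [1, 0, 1],
]
t = 2

max_freq, min_infreq = borders(A, t)
alpha, beta = len(max_freq), len(min_infreq)
bound = (len(A) - t + 1) * beta  # (m - t + 1) * beta from the abstract
print(alpha, beta, bound)
```

For this particular matrix, every pair of columns is 2-frequent while the full column set is supported by a single row, so α = 3, β = 1, and the bound (m−t+1)β = 3 is met with equality.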