Mining maximal frequent itemsets is one of the most fundamental problems in data mining. In this paper we study the complexity-theoretic aspects of maximal frequent itemset mining, from the perspective of counting the number of solutions. We present the first formal proof that the problem of counting the number of distinct maximal frequent itemsets in a database of transactions, given an arbitrary support threshold, is #P-complete, thereby providing strong theoretical evidence that the problem of mining maximal frequent itemsets is NP-hard. This result is of particular interest since the associated decision problem of checking the existence of a maximal frequent itemset is in P.

We also extend our complexity analysis to other similar data mining problems dealing with complex data structures, such as sequences, trees, and graphs, which have attracted intensive research interest in recent years. Normally, in these problems a partial order among frequent patterns can be defined in such a way as to preserve the downward closure property, with maximal frequent patterns being those without any successor with respect to this partial order. We investigate several variants of these mining problems in which the patterns of interest are subsequences, subtrees, or subgraphs, and show that the associated problems of counting the number of maximal frequent patterns are all either #P-complete or #P-hard.
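To make the counting problem concrete, the following is a minimal brute-force sketch in Python. It is not the paper's method: it simply enumerates every non-empty itemset, keeps those whose support meets the threshold, and reports the ones with no frequent proper superset. The function name `maximal_frequent_itemsets`, the toy transactions, and the convention of ignoring the empty itemset are all assumptions made for illustration. The running time is exponential in the number of items, which is consistent with, though of course no proof of, the hardness result stated above.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_support):
    """Return all maximal frequent itemsets by exhaustive enumeration.

    transactions: list of sets of items (hypothetical toy input).
    min_support:  minimum number of transactions that must contain an itemset.
    Assumption: only non-empty itemsets are considered.
    """
    # Collect the universe of items that appear in any transaction.
    items = sorted(set().union(*transactions))

    # Enumerate every non-empty itemset and keep the frequent ones.
    frequent = []
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                frequent.append(itemset)

    # An itemset is maximal if no frequent itemset strictly contains it.
    return [s for s in frequent if not any(s < other for other in frequent)]


# Toy database of five transactions over the items a, b, c.
toy_db = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]

# The counting problem asks only for the size of this result.
count_at_3 = len(maximal_frequent_itemsets(toy_db, min_support=3))
count_at_2 = len(maximal_frequent_itemsets(toy_db, min_support=2))
```

In this toy database every pair of items has support 3 while the full triple has support 2, so a threshold of 3 yields three maximal frequent itemsets and a threshold of 2 yields one. Note the downward closure property at work: every subset of a frequent itemset is itself frequent, so the maximal sets alone determine the whole frequent collection.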