In this paper we study the complexity-theoretic aspects of mining maximal frequent patterns, from the perspective of counting the distinct solutions. We present the first formal proof that the problem of counting the maximal frequent itemsets in a database of transactions, given an arbitrary support threshold, is #P-complete, thereby providing theoretical evidence that the problem of mining maximal frequent itemsets is NP-hard. We also extend our complexity analysis to similar data mining problems that deal with complex data structures, such as sequences, trees, and graphs. We investigate several variants of these mining problems in which the patterns of interest are subsequences, subtrees, or subgraphs, and show that the associated problems of counting the maximal frequent patterns are all either #P-complete or #P-hard.
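To make the counting problem concrete, the following is a minimal brute-force sketch in Python. It is not an algorithm from the paper. The function names, the toy database, and the reading of the support threshold as an absolute number of transactions are all assumptions made for illustration, and the empty itemset is deliberately left out of the enumeration. An itemset is treated as frequent when at least `min_support` transactions contain it, and as maximal when no strict superset is also frequent.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def maximal_frequent_itemsets(transactions, min_support):
    """Enumerate maximal frequent itemsets by exhaustive search.

    Illustrative only: the search visits every non-empty subset of the
    item universe, so it takes time exponential in the number of items.
    """
    items = sorted(set().union(*transactions))
    frequent = [
        frozenset(candidate)
        for size in range(1, len(items) + 1)
        for candidate in combinations(items, size)
        if support(frozenset(candidate), transactions) >= min_support
    ]
    # A frequent itemset is maximal if no strict superset is frequent.
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical toy database: four transactions over the items a, b, c, d.
toy_db = [frozenset("abc"), frozenset("abd"), frozenset("ab"), frozenset("cd")]
maximal = maximal_frequent_itemsets(toy_db, min_support=2)
print(len(maximal), sorted(sorted(s) for s in maximal))
```

On this toy database with a threshold of 2, the frequent itemsets are {a}, {b}, {c}, {d}, and {a, b}, of which {a, b}, {c}, and {d} are maximal, so the count is 3. The sketch only illustrates what is being counted; the #P-completeness result concerns how hard that count is to compute in general.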