Frequent patterns provide solutions to datasets that do not have well-structured feature vectors. However, frequent pattern mining is non-trivial, since the number of unique patterns is exponential yet many are non-discriminative and correlated. Currently, frequent pattern mining is performed in two sequential steps: enumerating a set of frequent patterns, followed by feature selection. Although many methods have been proposed in the past few years to perform each step efficiently, there has been limited success in eventually finding highly compact and discriminative patterns. The culprit is the inherent nature of this widely adopted two-step approach. This paper discusses these problems and proposes a new and different method. It builds a decision tree that partitions the data onto different nodes. Then, at each node, it directly discovers a discriminative pattern to further divide its examples into purer subsets. Since the number of examples towards the leaf level is relatively small, the new approach is able to examine patterns with extremely low global support that could not be enumerated on the whole dataset by the two-step method. The discovered feature vectors are more accurate on some of the most difficult graph and frequent itemset problems than those of most recently proposed algorithms, yet the total size is typically 50% or more smaller. Importantly, the minimum support of some discriminative patterns can be extremely low (e.g. 0.03%). To enumerate these low-support patterns, state-of-the-art frequent pattern algorithms either cannot finish due to huge memory consumption or have to enumerate 10^1 to 10^3 times more patterns before they can even be found. Software and datasets are available by contacting the author.
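The tree-based procedure described above can be illustrated with a minimal Python sketch for the itemset case. This is not the authors' implementation: the function names (`entropy`, `best_pattern`, `mine_patterns`), the use of information gain as the discriminative measure, and the brute-force search over itemsets of bounded length are all assumptions made for illustration, and the abstract does not say how the per-node search is actually carried out. The point it shows is that support is counted only on the examples that reach a node, so a pattern that is rare globally can still be found deep in the tree.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def best_pattern(rows, labels, max_len, min_count):
    """Return (gain, pattern) for the itemset with the highest information gain.

    Hypothetical brute-force search over itemsets of up to max_len items.
    Support is counted only on the examples that reached this node.
    """
    base = entropy(labels)
    n = len(labels)
    items = sorted(set().union(*rows))
    best_gain, best = 0.0, None
    for k in range(1, max_len + 1):
        for cand in combinations(items, k):
            pattern = frozenset(cand)
            inside = [y for x, y in zip(rows, labels) if pattern <= x]
            outside = [y for x, y in zip(rows, labels) if not pattern <= x]
            # Skip patterns below the local support threshold or that do not split.
            if len(inside) < min_count or not outside:
                continue
            gain = (base
                    - len(inside) / n * entropy(inside)
                    - len(outside) / n * entropy(outside))
            if gain > best_gain + 1e-12:
                best_gain, best = gain, pattern
    return best_gain, best


def mine_patterns(rows, labels, max_len=2, min_count=1, found=None):
    """Recursively partition the data and collect one pattern per internal node."""
    found = [] if found is None else found
    if len(set(labels)) <= 1:          # node is already pure
        return found
    _, pattern = best_pattern(rows, labels, max_len, min_count)
    if pattern is None:                # no pattern improves purity
        return found
    found.append(pattern)
    matched = [(x, y) for x, y in zip(rows, labels) if pattern <= x]
    unmatched = [(x, y) for x, y in zip(rows, labels) if not pattern <= x]
    for part in (matched, unmatched):
        xs, ys = zip(*part)
        mine_patterns(list(xs), list(ys), max_len, min_count, found)
    return found


# Toy data (made up for illustration): an example is positive
# only if it contains both items "a" and "b".
rows = [frozenset(r) for r in ("ab", "abc", "a", "b", "c", "ac")]
labels = [1, 1, 0, 0, 0, 0]
patterns = mine_patterns(rows, labels)
```

With `max_len=2` the sketch selects the itemset {a, b} at the root and stops, because both resulting subsets are pure. With `max_len=1` it first splits on {b} and then finds {a} inside the smaller subset of examples containing b, which mirrors the idea of discovering further patterns on progressively smaller partitions.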