Many studies have shown that rule-based classifiers perform well on categorical and sparse high-dimensional databases. However, a fundamental limitation of many rule-based classifiers is that they find rules by using heuristic methods to prune the search space, and then select rules under the sequential database covering paradigm. As a result, the final rule set may not contain the globally best rules for some instances in the training database. Moreover, these algorithms do not fully exploit more effective search-space pruning methods, which limits their ability to scale to large databases. In this paper, we present a new classifier, HARMONY, which directly mines the final set of classification rules. HARMONY uses an instance-centric rule-generation approach that ensures that, for each training instance, one of the highest-confidence rules covering that instance is included in the final rule set, which helps improve the overall accuracy of the classifier. By introducing several novel search strategies and pruning methods into the rule discovery process, HARMONY also achieves high efficiency and good scalability. Our performance study on large text and categorical databases shows that HARMONY outperforms many well-known classifiers in both accuracy and computational efficiency, and scales well with database size.
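The instance-centric guarantee described above can be illustrated with a small sketch. This is not the HARMONY algorithm: it enumerates candidate itemsets by brute force and omits the search strategies and pruning methods the abstract refers to. The function name `best_rule_per_instance`, the parameters `min_support` and `max_len`, and the tie-breaking by support are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def best_rule_per_instance(rows, labels, min_support=2, max_len=2):
    """For each training instance, return one highest-confidence rule
    (body, label, confidence, support) that covers it, or None.

    Brute-force illustration only (hypothetical helper, not HARMONY itself).
    """
    body_count = Counter()   # instances containing the itemset
    rule_count = Counter()   # instances containing the itemset with a given label
    for items, label in zip(rows, labels):
        for k in range(1, max_len + 1):
            for body in combinations(sorted(items), k):
                body_count[body] += 1
                rule_count[(body, label)] += 1

    best = []
    for items, label in zip(rows, labels):
        top = None
        for k in range(1, max_len + 1):
            for body in combinations(sorted(items), k):
                support = rule_count[(body, label)]
                if support < min_support:
                    continue  # rule is too rare to keep
                confidence = support / body_count[body]
                # Prefer higher confidence; break ties by higher support (assumed).
                if top is None or (confidence, support) > (top[2], top[3]):
                    top = (body, label, confidence, support)
        best.append(top)
    return best


rows = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
labels = ["pos", "pos", "neg", "neg"]
per_instance = best_rule_per_instance(rows, labels)
final_rules = {(r[0], r[1]) for r in per_instance if r is not None}
```

On this toy data, every instance is covered by a rule with confidence 1.0, and the final rule set is the union of the per-instance winners. A sequential covering approach, by contrast, would pick rules one at a time and remove covered instances, so a later instance may never see its best rule.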