Towards language independent automated learning of text categorization models
SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
Learning rules with negation for text categorization
Proceedings of the 2007 ACM symposium on Applied computing
Evolving Lucene search queries for text classification
Proceedings of the 9th annual conference on Genetic and evolutionary computation
In many application domains, there is a need for learning algorithms that generate classifiers that are both accurate and comprehensible. In this paper, we present TRIPPER, a rule induction algorithm that extends RIPPER, a widely used rule-learning algorithm. TRIPPER exploits knowledge in the form of taxonomies over the values of the features used to describe the data. We compare TRIPPER with RIPPER on benchmark datasets from the Reuters-21578 corpus, using WordNet (a human-generated taxonomy) to guide TRIPPER's rule induction. Our experiments show that the rules generated by TRIPPER are generally more comprehensible and compact than those generated by RIPPER and, in the large majority of cases, at least as accurate.
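To illustrate the general idea of using a taxonomy over feature values, the following is a minimal sketch and not TRIPPER's actual procedure. It assumes a hypothetical hand-written toy taxonomy (not WordNet) and hypothetical helper names (`ancestors`, `abstract_document`, `rule_fires`). It shows how a rule condition stated at a more abstract level of the taxonomy can cover documents that would otherwise need several word-specific rules.

```python
# Hypothetical toy taxonomy: child term -> parent term.
# Assumption for illustration only; not taken from WordNet or the paper.
TAXONOMY = {
    "wheat": "grain",
    "corn": "grain",
    "grain": "crop",
    "coffee": "crop",
}


def ancestors(term):
    """Return the term followed by all of its ancestors in the toy taxonomy."""
    chain = [term]
    while term in TAXONOMY:
        term = TAXONOMY[term]
        chain.append(term)
    return chain


def abstract_document(words):
    """Expand a bag of words with every ancestor concept of each word."""
    concepts = set()
    for word in words:
        concepts.update(ancestors(word))
    return concepts


def rule_fires(condition_terms, words):
    """A rule here is a conjunction of terms. It fires if every term occurs
    in the document literally or as an ancestor of some document word."""
    return set(condition_terms) <= abstract_document(words)


doc_a = ["wheat", "export"]
doc_b = ["corn", "export"]

specific_rule = ["wheat", "export"]  # word-level condition
general_rule = ["grain", "export"]   # condition abstracted one level up

print(rule_fires(specific_rule, doc_a), rule_fires(specific_rule, doc_b))
print(rule_fires(general_rule, doc_a), rule_fires(general_rule, doc_b))
```

In this toy setting the single abstracted rule covers both documents, whereas the word-level rule covers only one, which is one way a taxonomy might yield more compact rule sets.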