Olex: Effective Rule Learning for Text Categorization

Authors:
Pasquale Rullo;Veronica Lucia Policicchio;Chiara Cumbo;Salvatore Iiritano
Affiliations:
University of Calabria, Rende;University of Calabria, Rende;Exeura S.r.l., Rende;Exeura S.r.l., Rende
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2009

Cited by (7):

Some DLV Applications for Knowledge Management. LPNMR '09 Proceedings of the 10th International Conference on Logic Programming and Nonmonotonic Reasoning
Exploiting ASP in Real-World Applications: Main Strengths and Challenges. LPNMR '09 Proceedings of the 10th International Conference on Logic Programming and Nonmonotonic Reasoning
ROLEX-SP: Rules of lexical syntactic patterns for free text categorization. Knowledge-Based Systems
25 years of applications of logic programming in Italy. A 25-year perspective on logic programming
ASP at work: spin-off and applications of the DLV system. Logic programming, knowledge representation, and nonmonotonic reasoning
A multi-classifier system for text categorization. Proceedings of the 2011 ACM Symposium on Research in Applied Computation
GAMoN: Discovering M-of-N{¬,∨} hypotheses for text classification by a lattice-based Genetic Algorithm. Artificial Intelligence


Abstract

This paper describes Olex, a novel method for the automatic induction of rule-based text classifiers. Olex supports a hypothesis language of the form "if $T_1$ or $\cdots$ or $T_n$ occurs in document $d$, and none of $T_{n+1}, \ldots, T_{n+m}$ occurs in $d$, then classify $d$ under category $c$," where each $T_i$ is a conjunction of terms. The proposed method is simple and elegant. Despite this simplicity, systematic experiments on the Reuters-21578, Ohsumed, and ODP data collections show that Olex provides classifiers that are accurate, compact, and comprehensible. A comparative analysis against several well-known learning algorithms (namely, Naive Bayes, Ripper, C4.5, SVM, and Linear Logistic Regression) demonstrates that it is more than competitive in terms of both predictive accuracy and efficiency.
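
The hypothesis language in the abstract can be illustrated with a minimal Python sketch. It shows only how a rule of that form might be evaluated on a document, not how Olex learns such rules. The names `Rule`, `conj`, and `matches`, the example category, and the example terms are hypothetical and do not come from the paper. The sketch also assumes that a conjunction "occurs" in a document when all of its terms appear somewhere in that document.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple

# A conjunction of terms T_i, modelled as a set of terms (assumption).
Conjunction = FrozenSet[str]


def conj(*terms: str) -> Conjunction:
    """Build a conjunction from individual terms (hypothetical helper)."""
    return frozenset(terms)


@dataclass(frozen=True)
class Rule:
    """If any positive conjunction occurs and no negative one occurs, assign `category`."""
    category: str
    positive: Tuple[Conjunction, ...]  # T_1 ... T_n
    negative: Tuple[Conjunction, ...]  # T_{n+1} ... T_{n+m}

    def matches(self, document_terms: Iterable[str]) -> bool:
        terms = set(document_terms)
        # A conjunction occurs when all of its terms are in the document.
        any_positive = any(c <= terms for c in self.positive)
        no_negative = not any(c <= terms for c in self.negative)
        return any_positive and no_negative


# Illustrative rule and documents; the terms are invented for this example.
rule = Rule(
    category="grain",
    positive=(conj("wheat"), conj("corn", "export")),
    negative=(conj("stock", "market"),),
)

doc_a = "wheat prices rose sharply".split()
doc_b = "corn export and the stock market".split()
doc_c = "corn prices fell".split()

for name, doc in [("doc_a", doc_a), ("doc_b", doc_b), ("doc_c", doc_c)]:
    print(name, rule.matches(doc))
```

Under these assumptions, `doc_a` is classified under the category because one positive conjunction occurs and no negative one does. `doc_b` is rejected because the negative conjunction occurs, and `doc_c` is rejected because no positive conjunction is fully present.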