A Genetic Algorithm for Text Classification Rule Induction

  • Authors:
  • Adriana Pietramala;Veronica L. Policicchio;Pasquale Rullo;Inderbir Sidhu

  • Affiliations:
  • University of Calabria Rende, Italy;University of Calabria Rende, Italy;University of Calabria Rende, Italy and Exeura s.r.l. Rende, Italy;Kenetica Ltd Chicago, IL, USA

  • Venue:
  • ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
  • Year:
  • 2008

Abstract

This paper presents a Genetic Algorithm, called Olex-GA, for the induction of rule-based text classifiers of the form "classify document d under category c if t_1 ∈ d or ... or t_n ∈ d and not (t_{n+1} ∈ d or ... or t_{n+m} ∈ d) holds", where each t_i is a term. Olex-GA relies on an efficient several-rules-per-individual binary representation and uses the F-measure as the fitness function. The proposed approach is tested on the standard test sets Reuters-21578 and Ohsumed and compared against several classification algorithms (namely, Naive Bayes, Ripper, C4.5, and SVM). Experimental results show that it achieves very good performance on both data collections, proving competitive with (and in some cases outperforming) the evaluated classifiers.
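
To make the rule form and the fitness function concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names (`rule_matches`, `f_measure`), the set-of-terms document model, and the toy data are all assumptions made for illustration, and the genetic search itself (binary encoding, crossover, mutation) is not shown. The sketch only evaluates one candidate rule and scores it with the F-measure, as a fitness function might.

```python
from typing import Iterable, Sequence, Set


def rule_matches(doc_terms: Set[str],
                 positive: Iterable[str],
                 negative: Iterable[str]) -> bool:
    """Return True if the document satisfies the rule
    (t_1 in d or ... or t_n in d) and not (t_{n+1} in d or ... or t_{n+m} in d)."""
    has_positive = any(t in doc_terms for t in positive)
    has_negative = any(t in doc_terms for t in negative)
    return has_positive and not has_negative


def f_measure(predicted: Sequence[bool],
              actual: Sequence[bool],
              beta: float = 1.0) -> float:
    """F-measure of binary predictions against true labels.

    Returns 0.0 when precision and recall are both undefined or zero.
    """
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = beta ** 2 * precision + recall
    return (1 + beta ** 2) * precision * recall / denom if denom else 0.0


# Hypothetical toy corpus: each document is a set of terms,
# and each label says whether the document belongs to category c.
docs = [
    {"wheat", "price"},
    {"wheat", "corn", "export"},
    {"oil", "price"},
    {"corn", "harvest"},
]
labels = [True, False, True, True]

# Candidate rule: classify under c if "wheat" or "corn" occurs and "export" does not.
positive_terms = {"wheat", "corn"}
negative_terms = {"export"}

predictions = [rule_matches(d, positive_terms, negative_terms) for d in docs]
fitness = f_measure(predictions, labels)
```

On this toy data the rule accepts the first and fourth documents, rejects the second because of the negative term, and misses the third, so precision is 1 and recall is 2/3. A search procedure could compare candidate term sets by this score.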