A comparison of the performance of SVM and ARNI on Text Categorization with new filtering measures on an unbalanced collection

  • Authors:
  • Elías F. Combarro; Elena Montañés; José Ranilla; Javier Fernández

  • Affiliations:
  • Computer Science Department, University of Oviedo, Gijón (Asturias), Spain; Artificial Intelligence Center, University of Oviedo, Gijón (Asturias), Spain; Artificial Intelligence Center, University of Oviedo, Gijón (Asturias), Spain; Artificial Intelligence Center, University of Oviedo, Gijón (Asturias), Spain

  • Venue:
  • IWANN '03 Proceedings of the 7th International Work-Conference on Artificial and Natural Neural Networks: Part II: Artificial Neural Nets Problem Solving Methods
  • Year:
  • 2003

Abstract

Text Categorization (TC) is the task of assigning documents to a fixed set of predefined categories. Because of the large amount of information now available, much research aims to automate this time-consuming task, and Machine Learning (ML) algorithms have recently been applied for this purpose. In this paper, we compare the performance of two such algorithms (SVM and ARNI) on a collection whose documents are unevenly distributed across categories. Before learning, feature reduction is applied using both classical measures (information gain and term frequency) and three new measures that we propose here for the first time. We also compare the performance of these filtering measures.
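
As an illustration of the two classical filtering measures named in the abstract, the following is a minimal sketch of how terms might be scored and filtered before learning. It assumes the standard textbook definition of information gain for a term and a binary category, and a plain occurrence count for term frequency; the function names (`information_gain`, `term_frequency`, `top_terms`) and the toy documents are hypothetical and are not taken from the paper. The three new measures the paper proposes are not defined in the abstract and are not reproduced here.

```python
import math
from collections import Counter


def information_gain(docs, labels, term):
    """Information gain (in bits) of `term` for a binary category.

    docs   -- list of sets of terms, one set per document
    labels -- list of booleans, True if the document is in the category
    """
    n = len(docs)
    # Joint counts of (term present?, document in category?)
    counts = Counter((term in doc, label) for doc, label in zip(docs, labels))
    ig = 0.0
    for has_term in (True, False):
        for in_cat in (True, False):
            joint = counts[(has_term, in_cat)] / n
            p_term = (counts[(has_term, True)] + counts[(has_term, False)]) / n
            p_cat = (counts[(True, in_cat)] + counts[(False, in_cat)]) / n
            if joint > 0:  # empty cells contribute nothing
                ig += joint * math.log2(joint / (p_term * p_cat))
    return ig


def term_frequency(token_lists, term):
    """Total number of occurrences of `term` across all documents."""
    return sum(tokens.count(term) for tokens in token_lists)


def top_terms(docs, labels, keep):
    """Keep the `keep` highest-scoring terms (ties broken alphabetically)."""
    vocabulary = set().union(*docs)
    ranked = sorted(vocabulary,
                    key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:keep]


# Toy collection, invented purely for illustration.
toy_tokens = [["goal", "match", "goal"], ["goal", "team"],
              ["market", "stock"], ["stock", "team"]]
toy_docs = [set(tokens) for tokens in toy_tokens]
toy_labels = [True, True, False, False]  # True = "sports" category

selected = top_terms(toy_docs, toy_labels, keep=2)
```

In this toy collection, "goal" and "stock" each separate the two categories perfectly, so they receive the highest score, while "team" appears equally in both and scores zero. A classifier such as an SVM would then be trained on documents represented only by the selected terms.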