A comparison of text-categorization methods applied to n-gram frequency statistics

  • Authors:
  • Helmut Berger; Dieter Merkl

  • Affiliations:
  • Faculty of Information Technology, University of Technology, Sydney, NSW, Australia; School of Computing and Information Technology, University of Western Sydney, NSW, Australia

  • Venue:
  • AI'04 Proceedings of the 17th Australian joint conference on Advances in Artificial Intelligence
  • Year:
  • 2004

Abstract

This paper analyzes multi-class e-mail categorization performance, comparing a character n-gram document representation against a word-frequency-based representation. Furthermore, the impact of using available e-mail-specific meta-information on classification performance is explored, and the findings are presented.
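
To make the two document representations named in the abstract concrete, the following minimal Python sketch builds both from a raw string. It is an illustration only, not the authors' implementation: the function names, the n-gram length of 3, the lowercasing and whitespace normalization, and the simple word tokenizer are all assumptions, since the abstract does not specify any of them.

```python
import re
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams (hypothetical helper).

    Assumption: text is lowercased and runs of whitespace are collapsed
    to a single space before n-grams are extracted.
    """
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def word_frequencies(text):
    """Count word occurrences (hypothetical helper).

    Assumption: a word is a run of ASCII letters, digits, or apostrophes.
    """
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


if __name__ == "__main__":
    message = "Meeting moved to Monday. Meeting room unchanged."
    print(char_ngrams(message).most_common(5))
    print(word_frequencies(message).most_common(5))
```

Either counter can then be turned into a feature vector for a classifier. The n-gram variant needs no language-specific tokenization, which is one commonly cited reason for considering it alongside word frequencies.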