Classification of Polish email messages: experiments with various data representations

  • Authors:
  • Jerzy Stefanowski;Marcin Zienkowicz

  • Affiliations:
  • Institute of Computing Science, Poznań University of Technology, Poznań, Poland;Institute of Computing Science, Poznań University of Technology, Poznań, Poland

  • Venue:
  • ISMIS'06 Proceedings of the 16th international conference on Foundations of Intelligent Systems
  • Year:
  • 2006

Abstract

We consider the machine classification of Polish-language emails into user-specific folders. We experimentally evaluate how different approaches to constructing the data representation of emails affect classifier accuracy. Our results show that language processing techniques have a smaller influence than an appropriate selection of features, in particular features derived from the email header or its attachments.
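
As an illustration of what a data representation drawing on the header, body, and attachments might look like, the following is a minimal sketch using Python's standard `email` package. The abstract does not specify the authors' actual features, preprocessing, or classifiers, so the function name `email_features` and the feature naming scheme (`from=`, `subj:`, `body:`, `att_ext=`) are assumptions made purely for illustration.

```python
from email import message_from_string


def email_features(raw: str) -> dict:
    """Map one raw email message to a flat feature dictionary.

    Hypothetical feature scheme, not the representation used in the paper.
    """
    msg = message_from_string(raw)
    features = {}

    # Header features: normalized sender string and subject tokens.
    features["from=" + msg.get("From", "").strip().lower()] = 1
    for token in msg.get("Subject", "").lower().split():
        features["subj:" + token] = 1

    body_tokens = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        filename = part.get_filename()
        if filename:
            # Attachment features: presence and file extension only.
            features["has_attachment"] = 1
            ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
            features["att_ext=" + ext] = 1
        elif part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            body_tokens += payload.decode(charset, errors="replace").lower().split()

    # Body features: plain bag of words, with no stemming or stop-word removal.
    for token in body_tokens:
        key = "body:" + token
        features[key] = features.get(key, 0) + 1
    return features
```

A dictionary of this kind can be passed to any classifier that accepts sparse feature vectors. Keeping header, body, and attachment features under separate prefixes makes it straightforward to switch each group on or off when comparing representations, and language-specific steps such as stemming would apply only to the `body:` and `subj:` tokens.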