Fragments and text categorization

  • Authors:
  • Jan Blaták; Eva Mráková; Luboš Popelínský

  • Affiliations:
  • Masaryk University, Czech Republic; Masaryk University, Czech Republic; Masaryk University, Czech Republic

  • Venue:
  • ACLdemo '04 Proceedings of the ACL 2004 on Interactive poster and demonstration sessions
  • Year:
  • 2004

Abstract

We introduce two novel methods of text categorization in which documents are split into fragments. We conducted experiments on English, French, and Czech texts. In all cases, the task was binary document classification. We find that both methods increase the accuracy of text categorization. For the Naïve Bayes classifier, this increase is significant.