Document Classification Using Phrases

  • Authors: Jan Bakus; Mohamed S. Kamel

  • Venue: Proceedings of the Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition

  • Year: 2002

Abstract

This paper presents a Bayes document classifier using phrases as features. The phrases are extracted using a grammar that iteratively applies the rules to the sequence of words in the document. This grammar is generated from a training set using statistical word association. We report an improvement in the classification over the "bag of words" representation.
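
The general idea can be illustrated with a minimal Python sketch. It is not the authors' grammar or classifier: the abstract does not name the association measure, the form of the rules, or the Bayes model. The sketch assumes an approximate pointwise mutual information (PMI) score, greedy merging of adjacent word pairs as the "rules", and a multinomial naive Bayes classifier with add-one smoothing. All function and class names (learn_merge_rules, apply_rule, extract_phrases, NaiveBayes) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def apply_rule(tokens, rule):
    """Merge every adjacent occurrence of the pair `rule` into one phrase token."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == rule:
            out.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def learn_merge_rules(docs, n_rules=3, min_count=2):
    """Learn an ordered list of merge rules from tokenized training documents.

    Assumption: pairs are ranked by an approximate PMI score; the abstract
    only says "statistical word association".
    """
    docs = [list(d) for d in docs]
    rules = []
    for _ in range(n_rules):
        unigrams = Counter(t for d in docs for t in d)
        bigrams = Counter(p for d in docs for p in zip(d, d[1:]))
        total = sum(unigrams.values())
        scores = {
            p: math.log(c * total / (unigrams[p[0]] * unigrams[p[1]]))
            for p, c in bigrams.items()
            if c >= min_count
        }
        if not scores:
            break
        # Ties are broken by the pair itself so the result is deterministic.
        best = max(scores, key=lambda p: (scores[p], p))
        rules.append(best)
        # Re-tokenize so later rules can build on earlier merges.
        docs = [apply_rule(d, best) for d in docs]
    return rules


def extract_phrases(tokens, rules):
    """Apply the learned rules, in order, to a sequence of words."""
    for rule in rules:
        tokens = apply_rule(tokens, rule)
    return tokens


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (an assumed model)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.feature_counts[label].update(doc)
        self.vocab = {f for c in self.feature_counts.values() for f in c}
        return self

    def predict(self, doc):
        n_docs = sum(self.class_counts.values())

        def log_posterior(label):
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            score = math.log(self.class_counts[label] / n_docs)
            for f in doc:
                if f in self.vocab:  # features unseen in training are skipped
                    score += math.log((self.feature_counts[label][f] + 1) / denom)
            return score

        return max(self.class_counts, key=log_posterior)
```

To use the sketch, learn the rules on tokenized training documents, pass every training and test document through extract_phrases, and fit NaiveBayes on the resulting phrase tokens. Training the same classifier on the unmerged tokens gives the "bag of words" baseline for comparison.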