Context-based methods for text categorisation

  • Authors: D. S. Hunnisett; W. J. Teahan
  • Affiliations: CORData, Parc Meni, Bangor; University of Wales, Bangor
  • Venue: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year: 2004

Abstract

We propose several context-based methods for text categorization. One method is a small modification to the PPM compression-based model. The modification is known to significantly degrade compression performance, yet, counter-intuitively, it improves categorization performance. Another method, called C-measure, simply counts the presence of higher-order character contexts and outperforms all other approaches investigated.
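
The abstract does not give the exact definition of the C-measure, so the following is only a minimal sketch of the general idea of counting the presence of higher-order character contexts. It assumes a context is a fixed-length character n-gram and that a document is assigned to the category whose training text contains the most of its n-grams. The function names, the order `n=5`, and the toy training data are all hypothetical and are not taken from the paper.

```python
def char_ngrams(text, n):
    """Return the set of all character n-grams (contexts of length n) in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_models(training, n=5):
    """Build one set of seen n-grams per category.

    training maps a category label to a list of training documents.
    (Assumed representation; the paper's model may differ.)
    """
    models = {}
    for label, docs in training.items():
        seen = set()
        for doc in docs:
            seen |= char_ngrams(doc, n)
        models[label] = seen
    return models


def presence_score(text, model, n=5):
    """Count how many n-gram positions in text occur in the category model."""
    return sum(1 for i in range(len(text) - n + 1) if text[i:i + n] in model)


def classify(text, models, n=5):
    """Pick the category whose model contains the most of the text's n-grams."""
    return max(models, key=lambda label: presence_score(text, models[label], n))


if __name__ == "__main__":
    # Toy data, for illustration only.
    training = {
        "sport": ["the team won the football match"],
        "finance": ["the bank raised interest rates"],
    }
    models = build_models(training)
    print(classify("football team match", models))
```

Unlike a PPM-based classifier, this sketch estimates no probabilities: it only checks whether each context has been seen before, which is the sense in which the abstract describes the measure as simple.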