Summarization as feature selection for text categorization

  • Authors:
  • Aleksander Kolcz;Vidya Prabakarmurthi;Jugal Kalita

  • Affiliations:
  • Personalogy, Inc., Colorado Springs, CO;University of Colorado at Colorado Springs, Colorado Springs, CO;University of Colorado at Colorado Springs, Colorado Springs, CO

  • Venue:
  • Proceedings of the tenth international conference on Information and knowledge management
  • Year:
  • 2001

Abstract

We address the problem of evaluating the effectiveness of summarization techniques for the task of document categorization. We argue that, for a large class of automatic categorization algorithms, extraction-based document summarization can be viewed as a particular form of feature selection performed on the full text of the document. In this context, its impact can be compared with that of state-of-the-art feature selection techniques devised specifically to provide good categorization performance. Such a framework allows a better assessment of the expected performance of a categorizer when the compression rate of the summarizer is known.
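The central idea of the abstract can be illustrated with a minimal sketch, under assumptions not taken from the paper: a bag-of-words representation, a naive sentence splitter, and a "lead sentences" extractor standing in for whatever summarizers the authors evaluate. All function names (`sentences`, `bag_of_words`, `lead_summary`, `compression_rate`) are hypothetical. The sketch shows that an extractive summary yields a feature set that is a subset of the full-text features, and that the compression rate measures how much of the text is retained.

```python
import re
from collections import Counter


def sentences(text):
    """Split text into sentences with a naive punctuation rule (an assumption)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bag_of_words(text):
    """Lowercased token counts: the features a simple categorizer would see."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def lead_summary(text, n_sentences=1):
    """Hypothetical extractive summarizer: keep only the first n sentences."""
    return " ".join(sentences(text)[:n_sentences])


def compression_rate(text, summary):
    """Fraction of the document's tokens retained by the summary."""
    full = sum(bag_of_words(text).values())
    return sum(bag_of_words(summary).values()) / full if full else 0.0


doc = (
    "Stock markets fell sharply on Monday. "
    "Analysts blamed rising interest rates. "
    "The weather in London was mild."
)

summary = lead_summary(doc, n_sentences=1)
full_features = bag_of_words(doc)
summary_features = bag_of_words(summary)

# Extraction only keeps text that is already in the document, so the
# summary's features are a subset of the full-text features.
dropped = set(full_features) - set(summary_features)
rate = compression_rate(doc, summary)
```

Because the summarizer only removes text, a categorizer trained on `summary_features` sees a reduced feature set, which is what makes the comparison with dedicated feature selection methods possible at a given compression rate.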