Temporal feature modification for retrospective categorization

  • Authors:
  • Robert Liebscher; Richard K. Belew

  • Affiliations:
  • University of California, San Diego; University of California, San Diego

  • Venue:
  • FeatureEng '05 Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing
  • Year:
  • 2005

Abstract

We show that the intelligent use of one small piece of contextual information, a document's publication date, can improve the performance of classifiers trained on a text categorization task. We focus on academic research documents, where the date of publication undoubtedly has an effect on an author's choice of words. To exploit this contextual feature, we propose the technique of temporal feature modification, which takes various sources of lexical change into account, including changes in term frequency, associative strength between terms and categories, and dynamic categorization systems. We present results of classification experiments using both full-text papers and abstracts of conference proceedings, showing improved classification accuracy across the whole collection, with performance increases of more than 40% when temporal features are exploited. The technique is fast, classifier-independent, and works well even when making only a few modifications.
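
The abstract does not spell out how features are modified, so the following is only an illustrative sketch of one way a publication date could be folded into a bag-of-words representation before any classifier sees it. It assumes that a small set of terms has already been flagged as changing over time, and it rewrites those terms into period-tagged pseudo-terms. The function name `modify_features`, the `drifting_terms` set, the `@` tag format, and the five-year `window` are all hypothetical choices made for this example and are not taken from the paper.

```python
from typing import Iterable, List, Set


def modify_features(tokens: Iterable[str], year: int,
                    drifting_terms: Set[str], window: int = 5) -> List[str]:
    """Rewrite selected tokens into period-tagged pseudo-terms.

    Assumption for illustration: `drifting_terms` holds the terms whose
    usage is believed to change over time; all other tokens are left
    untouched, so only a few features are modified.
    """
    # Map the publication year to the first year of its window,
    # e.g. 1993 -> 1990 when window is 5.
    period_start = year - (year % window)
    return [
        f"{tok}@{period_start}" if tok in drifting_terms else tok
        for tok in tokens
    ]


doc = ["neural", "network", "model"]
early = modify_features(doc, 1993, {"neural"})
late = modify_features(doc, 2004, {"neural"})
```

Because the rewrite happens on the token stream, the output can be passed to any downstream vectorizer and classifier unchanged, which matches the abstract's description of the technique as classifier-independent. In this sketch, "neural" in a 1993 document and "neural" in a 2004 document become distinct features, while "network" and "model" remain shared.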