Integrating Background Knowledge into Nearest-Neighbor Text Classification

  • Authors:
  • Sarah Zelikovitz; Haym Hirsh

  • Venue:
  • ECCBR '02 Proceedings of the 6th European Conference on Advances in Case-Based Reasoning
  • Year:
  • 2002

Abstract

This paper describes two approaches for incorporating background knowledge into nearest-neighbor text classification. Our first approach uses background text to assess the similarity between training and test documents rather than assessing their similarity directly. The second approach redescribes examples using Latent Semantic Indexing on the background knowledge and assesses document similarities in this redescribed space. Our experimental results show that both approaches can improve the performance of nearest-neighbor text classification. These methods are especially useful when labeling text is labor-intensive and when a large amount of information about a specific problem is available on the World Wide Web.
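
The idea behind the first approach can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how similarities through the background text are computed or combined, so the scoring rule here (the best product of two cosine similarities over raw term-frequency vectors) is an assumption, as are the function names `tf_vector`, `cosine`, `bridged_similarity`, and `classify` and the toy documents. The point is only that a test document sharing no words with any training document can still be matched when an unlabeled background document overlaps with both.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector over lowercased whitespace tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def bridged_similarity(train_vec, test_vec, background_vecs):
    """Score a train/test pair through the background set (assumed rule:
    the best product of train-to-background and background-to-test similarity)."""
    return max(
        (cosine(train_vec, b) * cosine(b, test_vec) for b in background_vecs),
        default=0.0,
    )


def classify(test_text, training, background, k=1):
    """k-nearest-neighbor vote using bridged rather than direct similarity."""
    test_vec = tf_vector(test_text)
    background_vecs = [tf_vector(b) for b in background]
    scored = sorted(
        (
            (bridged_similarity(tf_vector(text), test_vec, background_vecs), label)
            for text, label in training
        ),
        reverse=True,
    )
    votes = Counter()
    for score, label in scored[:k]:
        votes[label] += score
    return votes.most_common(1)[0][0]


# Hypothetical toy data: the test document shares no words with either
# training document, so direct cosine similarity is zero for both.
training = [
    ("quantum entanglement experiment", "physics"),
    ("protein folding enzyme", "biology"),
]
background = [
    "quantum entanglement of photon pairs enables qubit experiments",
    "enzyme activity governs protein folding in the cell",
]
predicted = classify("photon qubit", training, background)
```

In this toy setting the first background document overlaps with both the test document and the physics training document, so the physics example receives the only nonzero bridged score. A fuller treatment would use TF-IDF weighting, and the second approach would replace `bridged_similarity` with a cosine in a reduced space obtained from a singular value decomposition of the background term-document matrix.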