Information gain ratio as term weight: the case of summarization of IR results

  • Authors: Tatsunori Mori

  • Affiliations: Yokohama National University, Yokohama, Japan

  • Venue: COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
  • Year: 2002

Abstract

This paper proposes a new term weighting method for summarizing documents retrieved by an IR system. Unlike query-biased summarization, our method uses not the information in the query but the similarity information among the original documents, obtained by hierarchical clustering. To map the similarity structure of the clusters onto the weight of each word, we adopt the information gain ratio of each word's probability distribution as the term weight.
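
The abstract does not spell out the formula, so the following is only a minimal sketch of how an information gain ratio could be computed for one word at one node of a cluster hierarchy. It assumes the standard decision-tree definition (information gain divided by split information) and treats each token as a binary event, "is this word" versus "is not this word". The function names and the count-based inputs are hypothetical and are not taken from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def binary_entropy(hits, total):
    """Entropy of the event 'token is the word' given hits out of total tokens."""
    if total == 0:
        return 0.0
    p = hits / total
    return entropy([p, 1.0 - p])


def gain_ratio(word_counts, total_counts):
    """Information gain ratio of a word for one split of a cluster (assumed form).

    word_counts[i]  -- occurrences of the word in sub-cluster i
    total_counts[i] -- total number of tokens in sub-cluster i
    """
    n = sum(total_counts)
    if n == 0:
        return 0.0
    # Entropy of the word's occurrence in the parent cluster.
    parent = binary_entropy(sum(word_counts), n)
    # Size-weighted entropy after dividing the parent into sub-clusters.
    children = sum(
        (t / n) * binary_entropy(w, t)
        for w, t in zip(word_counts, total_counts)
    )
    gain = parent - children
    # Split information: entropy of the partition sizes themselves.
    split = entropy([t / n for t in total_counts])
    return gain / split if split > 0 else 0.0


if __name__ == "__main__":
    # Word concentrated in one sub-cluster versus spread evenly over two.
    print(gain_ratio([10, 0], [10, 10]))
    print(gain_ratio([5, 5], [10, 10]))
```

Under these assumptions, a word whose occurrences concentrate in one sub-cluster receives a high weight, while a word spread evenly across sub-clusters receives a weight near zero, which is the sense in which the cluster structure is mapped onto term weights.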