Relevance models for topic detection and tracking

  • Authors:
  • Victor Lavrenko;James Allan;Edward DeGuzman;Daniel LaFlamme;Veera Pollard;Stephen Thomas

  • Affiliations:
  • University of Massachusetts, Amherst, MA;University of Massachusetts, Amherst, MA;University of Massachusetts, Amherst, MA;University of Massachusetts, Amherst, MA;University of Massachusetts, Amherst, MA;University of Massachusetts, Amherst, MA

  • Venue:
  • HLT '02 Proceedings of the second international conference on Human Language Technology Research
  • Year:
  • 2002

Abstract

We extend relevance modeling to the link detection task of Topic Detection and Tracking (TDT) and show that it substantially improves performance. Relevance modeling, a statistical language modeling technique related to query expansion, is used to enhance the topic model estimate associated with a news story, boosting the probability of words that are associated with the story even when they do not appear in the story. To apply relevance modeling to TDT, the technique had to be extended to work with stories rather than short queries, and the similarity comparison had to be changed to a modified form of Kullback-Leibler divergence. We demonstrate that relevance models yield substantial improvements over the language modeling baseline. We also show how the use of relevance modeling makes it possible to choose a single parameter for within- and cross-mode comparisons of stories.
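
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes Jelinek-Mercer smoothed unigram models, a uniform prior over collection documents, and a plain symmetrized Kullback-Leibler divergence, because the abstract does not specify the modified form the paper uses. All function names, the smoothing weight lam, and the toy collection are hypothetical.

```python
import math
from collections import Counter


def background_model(docs):
    """Maximum-likelihood unigram distribution over every token in docs."""
    counts = Counter(w for d in docs for w in d)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def unigram_model(tokens, background, lam=0.5):
    """Jelinek-Mercer smoothed unigram model defined over the background vocabulary.

    Assumes every token also occurs in the background vocabulary.
    """
    counts = Counter(tokens)
    total = len(tokens)
    return {w: lam * counts[w] / total + (1 - lam) * p_bg
            for w, p_bg in background.items()}


def relevance_model(story, collection, background, lam=0.5):
    """Estimate P(w|R) as sum_D P(w|D) * P(D|story).

    The whole story plays the role of the query. P(D|story) is taken to be
    proportional to the story likelihood under D (uniform prior, an assumption).
    """
    doc_models = [unigram_model(d, background, lam) for d in collection]
    # Work in log space: a full story has many more tokens than a short query,
    # so the raw likelihood would underflow.
    log_liks = [sum(math.log(m[w]) for w in story) for m in doc_models]
    top = max(log_liks)
    weights = [math.exp(ll - top) for ll in log_liks]
    norm = sum(weights)
    return {w: sum(wt * m[w] for wt, m in zip(weights, doc_models)) / norm
            for w in background}


def symmetric_kl(p, q):
    """Symmetrized KL divergence between two distributions on the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) + q[w] * math.log(q[w] / p[w])
               for w in p)


# Toy data (hypothetical): two earthquake stories and one election story.
collection = [
    "earthquake hits city rescue teams search rubble".split(),
    "rescue teams find survivors after earthquake".split(),
    "election results announced candidate wins vote".split(),
]
story_a = "earthquake damages city".split()
story_b = "survivors found in rubble".split()
story_c = "candidate wins election".split()

background = background_model(collection + [story_a, story_b, story_c])
rm_a = relevance_model(story_a, collection, background)
rm_b = relevance_model(story_b, collection, background)
rm_c = relevance_model(story_c, collection, background)

print("same topic:     ", symmetric_kl(rm_a, rm_b))
print("different topic:", symmetric_kl(rm_a, rm_c))
```

Stories a and b share no words, yet their relevance models both draw most of their probability mass from the two earthquake documents, so their divergence is smaller than the divergence between a and c. In a link detection setting, a score like this would be compared against a decision threshold to decide whether two stories discuss the same topic.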