First story detection using a composite document representation

  • Authors:
  • Nicola Stokes;Joe Carthy

  • Affiliations:
  • University College Dublin, Ireland;University College Dublin, Ireland

  • Venue:
  • HLT '01 Proceedings of the first international conference on Human language technology research
  • Year:
  • 2001

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we explore the effects of data fusion on First Story Detection [1] in a broadcast news domain. The data fusion element of this experiment involves the combination of evidence derived from two distinct representations of document content in a single cluster run. Our composite document representation consists of a concept representation (based on the lexical chains derived from a text) and free text representation (using traditional keyword index terms). Using the TDT1 evaluation methodology we evaluate a number of document representation strategies and propose reasons why our data fusion experiment shows performance improvements in the TDT domain.