Influence of speech recognition errors on topic detection (poster session)

  • Authors:
  • J. Scott McCarley; Martin Franz

  • Affiliations:
  • IBM T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY; IBM T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY

  • Venue:
  • SIGIR '00: Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2000

Abstract

We investigate the effect of speech-recognition errors on a system for the unsupervised, nearly synchronous clustering of broadcast news stories, using the TDT (Topic Detection and Tracking) Corpora. Two questions are addressed: (1) Are speech recognition errors detrimental to the performance of the system? (2) Can a background collection of contemporaneous clean text improve performance? We investigate both the large-cluster and small-cluster limits.