Speech Transcript Analysis for Automatic Search

  • Authors:
  • A. Coden

  • Venue:
  • HICSS '01: Proceedings of the 34th Annual Hawaii International Conference on System Sciences (HICSS-34), Volume 4
  • Year:
  • 2001

Abstract

We address the problem of finding collateral information pertinent to a live television broadcast in real time. The solution starts with a text transcript of the broadcast generated by an automatic speech recognition system. Speaker-independent speech recognition technology, even when tailored for a broadcast scenario, generally produces transcripts with relatively low accuracy. Given this limitation, we have developed algorithms that can determine the essence of the broadcast from these transcripts. Specifically, we extract named entities, topics, and sentence types from the transcript and use them to automatically generate both structured and unstructured search queries. A novel distance-ranking algorithm is used to select relevant information from the search results. The whole process is performed online, and the query results (i.e., the collateral information) are added to the broadcast stream.
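
The query-generation step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `TranscriptFeatures` container, the function names, the field names, and the sample values are all hypothetical. The sketch assumes that named entities and topics have already been extracted from the transcript. It leaves out sentence types and the distance-ranking step, because the abstract does not say how they work.

```python
from dataclasses import dataclass


@dataclass
class TranscriptFeatures:
    """Hypothetical container for features already extracted from a transcript."""
    entities: dict[str, list[str]]  # entity type -> names, e.g. {"person": [...]}
    topics: list[str]


def build_structured_query(features: TranscriptFeatures) -> dict[str, list[str]]:
    """Build a field -> values query (assumed form of a 'structured' query)."""
    query = {
        entity_type: sorted(set(names))
        for entity_type, names in features.entities.items()
        if names
    }
    if features.topics:
        query["topic"] = list(features.topics)
    return query


def build_unstructured_query(features: TranscriptFeatures, max_terms: int = 5) -> str:
    """Build a free-text keyword query (assumed form of an 'unstructured' query)."""
    terms: list[str] = []
    for names in features.entities.values():
        terms.extend(names)
    terms.extend(features.topics)

    # Drop case-insensitive duplicates while keeping first-seen order.
    seen: set[str] = set()
    unique: list[str] = []
    for term in terms:
        key = term.lower()
        if key not in seen:
            seen.add(key)
            unique.append(term)

    # Quote multi-word terms so a search engine can treat them as phrases.
    return " ".join(f'"{t}"' if " " in t else t for t in unique[:max_terms])


# Illustrative input only; these values are not from the paper.
features = TranscriptFeatures(
    entities={"person": ["Alan Greenspan"], "organization": ["Federal Reserve"]},
    topics=["interest rates"],
)
structured = build_structured_query(features)
unstructured = build_unstructured_query(features)
```

Keeping the two query forms separate lets the same extracted features feed both a fielded database lookup and a general keyword search. This is one plausible reading of "structured and unstructured search queries".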