Improving information extraction by modeling errors in speech recognizer output

  • Authors:
  • David D. Palmer; Mari Ostendorf

  • Affiliations:
  • The MITRE Corporation, Bedford, MA; University of Washington, Seattle, WA

  • Venue:
  • HLT '01: Proceedings of the First International Conference on Human Language Technology Research
  • Year:
  • 2001

Abstract

In this paper we describe a technique for improving the performance of an information extraction system for speech data by explicitly modeling the errors in the recognizer output. The approach combines a statistical model of named entity states with a lattice representation of hypothesized words and errors annotated with recognition confidence scores. Additional refinements include the use of multiple error types, improved confidence estimation, and multipass processing. In combination, these techniques improve named entity recognition performance over a text-based baseline by 28%.
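
The idea of pairing each hypothesized word with an explicit error alternative can be illustrated with a small sketch. This is not the authors' model: the two-state tag set, the `<err>` token, the floor value, and every probability below are hypothetical toy values chosen only to show the mechanics. Each recognized word with confidence c becomes a two-arc lattice slot (the word with weight c, an error token with weight 1 - c), and a Viterbi pass over named entity states scores each slot by summing over its arcs.

```python
import math

ERROR = "<err>"          # hypothetical token standing in for a misrecognized word
STATES = ("O", "NAME")   # toy tag set: outside a name / inside a name
FLOOR = 1e-4             # assumed floor for tokens a state has never emitted

# Toy parameters, invented for illustration only.
START = {"O": 0.8, "NAME": 0.2}
TRANS = {"O": {"O": 0.8, "NAME": 0.2}, "NAME": {"O": 0.5, "NAME": 0.5}}
EMIT = {
    "O": {"met": 0.3, "with": 0.3, "today": 0.3, ERROR: 0.05},
    "NAME": {"ostendorf": 0.4, ERROR: 0.4},  # errors assumed likelier inside names
}


def build_lattice(hypothesis):
    """Turn [(word, confidence), ...] into slots of (token, weight) arcs."""
    return [[(word, conf), (ERROR, 1.0 - conf)] for word, conf in hypothesis]


def slot_score(slot, state):
    """Emission score of a slot: confidence-weighted sum over its arcs."""
    return sum(w * EMIT[state].get(tok, FLOOR) for tok, w in slot)


def viterbi(lattice):
    """Return the most likely state sequence for the lattice (log domain)."""
    if not lattice:
        return []
    delta = {s: math.log(START[s]) + math.log(slot_score(lattice[0], s))
             for s in STATES}
    backpointers = []
    for slot in lattice[1:]:
        new_delta, ptr = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: delta[p] + math.log(TRANS[p][s]))
            ptr[s] = prev
            new_delta[s] = (delta[prev] + math.log(TRANS[prev][s])
                            + math.log(slot_score(slot, s)))
        delta = new_delta
        backpointers.append(ptr)
    state = max(STATES, key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# "austin" is a low-confidence word, so most of its slot weight goes to <err>.
hyp = [("met", 0.9), ("with", 0.9), ("austin", 0.2), ("today", 0.9)]
print(viterbi(build_lattice(hyp)))  # ['O', 'O', 'NAME', 'O']
```

In this toy setup the low-confidence slot is tagged as a name even though the recognized word itself is unknown to the name model, because the error arc carries most of the weight. The abstract's other refinements (multiple error types, improved confidence estimation, multipass processing) are not represented here.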