Joint estimation of confidence and error causes in speech recognition

  • Authors:
  • Atsunori Ogawa; Atsushi Nakamura

  • Affiliations:
  • NTT Communication Science Laboratories, NTT Corporation, 2-4 Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan; NTT Communication Science Laboratories, NTT Corporation, 2-4 Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan

  • Venue:
  • Speech Communication
  • Year:
  • 2012

Abstract

Speech recognition errors are essentially unavoidable under severe real-field conditions, so confidence estimation, which scores the reliability of a recognition result, plays a critical role in developing speech-recognition-based application systems for real-field use. However, to develop an application system that provides a high-quality service, accurate confidence estimation alone is not enough: we also need to extract and exploit further supplementary information from the speech recognition engine. As a first step in this direction, we propose a method, based on a discriminative model, that estimates the confidence of a recognition result while jointly detecting the causes of recognition errors. The confidence of a recognition result and the presence or absence of error causes are naturally correlated. By directly capturing these correlations, the proposed method improves its estimation of the confidence and of each error cause in a complementary manner. In initial speech recognition experiments, the proposed method achieved higher confidence estimation accuracy than a state-of-the-art confidence estimation method based on a discriminative model. Detailed analyses further confirmed that the estimation mechanism of the proposed method works effectively.