Unified stochastic engine (USE) for speech recognition

  • Authors:
  • X. Huang; M. Belin; F. Alleva; M. Hwang

  • Affiliations:
  • School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania (all authors)

  • Venue:
  • ICASSP'93: Proceedings of the 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing: Speech Processing - Volume II
  • Year:
  • 1993


Abstract

In most speech recognition systems, acoustic and language models are constructed separately: language models are derived from a large text corpus without considering confusable acoustic data, and acoustic models are optimized without considering the language model's discrimination capacity. One consequence is that unbalanced acoustic and language models must be combined with an ad hoc constant language weight that is tuned on development data. This paper describes a unified stochastic engine that jointly optimizes both acoustic and language models. We develop a general modeling framework. At present, we focus only on language weight optimization, which is a special case of acoustic-driven language modeling. We report preliminary experimental results for Wall Street Journal continuous 5000-word speaker-independent dictation, where the error rate is reduced from 7.3% to 6.9% with the proposed method.
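The ad hoc combination the abstract refers to is conventionally a log-linear interpolation of acoustic and language model log probabilities under a single constant weight. The sketch below (Python, with hypothetical data structures and a simplified error count, not the paper's USE formulation) illustrates that baseline and how the weight is typically tuned by grid search on development data.

```python
# A minimal sketch of the conventional baseline the abstract describes:
# score a hypothesis W given acoustics X as
#     log P(X | W) + lambda * log P(W),
# with one constant language weight lambda tuned on development data.
# Data structures and the error count are illustrative assumptions only.

import math

def combined_score(acoustic_logprob, lm_logprob, language_weight):
    """Log-linear combination of acoustic and language model scores."""
    return acoustic_logprob + language_weight * lm_logprob

def pick_best_hypothesis(hypotheses, language_weight):
    """hypotheses: list of (words, acoustic_logprob, lm_logprob) tuples."""
    return max(
        hypotheses,
        key=lambda h: combined_score(h[1], h[2], language_weight),
    )[0]

def tune_language_weight(dev_utterances, candidate_weights):
    """Choose the constant weight minimizing word errors on development data.

    dev_utterances: list of (hypotheses, reference_words) pairs.
    """
    def word_errors(hyp, ref):
        # Simplified positional mismatch count; a real scorer would use
        # edit distance over insertions, deletions, and substitutions.
        return sum(h != r for h, r in zip(hyp, ref)) + abs(len(hyp) - len(ref))

    best_weight, best_errors = None, math.inf
    for w in candidate_weights:
        errors = sum(
            word_errors(pick_best_hypothesis(hyps, w), ref)
            for hyps, ref in dev_utterances
        )
        if errors < best_errors:
            best_weight, best_errors = w, errors
    return best_weight
```

The paper's contribution is to move beyond this decoupled setup, tying the weight (and, in the general framework, the language model itself) to the acoustic evidence rather than fixing a single global constant from development data.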