Acoustic modeling of subword units for large vocabulary speaker independent speech recognition

  • Authors: Chin-Hui Lee; Lawrence R. Rabiner; Roberto Pieraccini; Jay G. Wilpon
  • Affiliations: AT&T Bell Laboratories, Murray Hill, NJ (all authors)
  • Venue: HLT '89 Proceedings of the workshop on Speech and Natural Language
  • Year: 1989

Abstract

The field of large vocabulary, continuous speech recognition has advanced to the point where several systems are capable of attaining between 90 and 95% word accuracy for speaker-independent recognition of a 1000-word vocabulary, spoken fluently, for a task with a perplexity (average word branching factor) of about 60. Several factors account for the high performance of these systems: the use of hidden Markov models (HMMs) for acoustic modeling, the use of context-dependent subword units, the representation of between-word phonemic variation, and the use of corrective training techniques to emphasize differences between acoustically similar words in the vocabulary. In this paper we describe one of the large vocabulary speech recognition systems being developed at AT&T Bell Laboratories and discuss the methods used to provide high word recognition accuracy. In particular, we focus on the techniques used to obtain acoustic models of the subword units (both context-independent and context-dependent), and discuss the resulting system performance as a function of the type of acoustic modeling used.
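
The abstract glosses perplexity as an average word branching factor. As a minimal sketch of that reading, the Python below computes perplexity as the exponential of the average negative log probability a language model assigns to each word. This is the standard definition, not a procedure taken from the paper: the abstract does not say how its figure of about 60 was measured, and the function name `perplexity` and the probabilities used here are hypothetical.

```python
import math


def perplexity(word_probs):
    """Perplexity of a word sequence.

    word_probs holds the probability the language model assigned to each
    word given its context (illustrative values, not data from the paper).
    """
    if not word_probs:
        raise ValueError("need at least one word probability")
    # Average negative log probability per word, in nats.
    avg_neg_log = -sum(math.log(p) for p in word_probs) / len(word_probs)
    # Exponentiating gives the effective number of equally likely choices.
    return math.exp(avg_neg_log)


# Hypothetical case: each word is one of 60 equally likely successors.
uniform = [1.0 / 60] * 10
print(perplexity(uniform))
```

When every word has 60 equally likely successors, the result is 60 up to floating-point rounding, which is the sense in which a perplexity of about 60 can be read as roughly 60 competing word choices at each point in an utterance.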