Implementation aspects of large vocabulary recognition based on intraword and interword phonetic units

  • Authors:
  • R. Pieraccini; C. H. Lee; E. Giachin; L. R. Rabiner

  • Venue:
  • HLT '90: Proceedings of the Workshop on Speech and Natural Language
  • Year:
  • 1990

Abstract

Most large vocabulary speech recognition systems consist of a training algorithm and a recognition structure, the latter being essentially a search for the best path through a rather large decoding network. Although the performance of the recognizer is crucially tied to the details of the training procedure, the recognition structure must be efficient in computation and memory, and accurate in actually determining the best path through the lattice, so that a wide range of training (sub-word unit creation) strategies can be evaluated in a reasonable time. We have considered an architecture that combines several well-known procedures (beam search, compiled network, etc.) with some new ideas (stacks of active network nodes, likelihood computation on demand, guided search, etc.) to implement a search procedure that maintains the accuracy of the full search but can decode a single sentence in about one minute of computing time (about 20 times real time) on a vectorized, concurrent processor. This paper describes how we achieved this computational reduction.
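
To make two of the ideas named in the abstract concrete, the following is a minimal sketch of a frame-synchronous beam search in which likelihoods are computed only on demand, that is, only for nodes the search actually reaches. It is not the authors' implementation: the function `beam_search`, the toy network `ARCS`, the scoring function `toy_logp`, and the beam width are all hypothetical, and a plain dictionary of active nodes stands in for whatever node-stack structure the paper uses.

```python
def beam_search(frames, arcs, start, final, emit_logp, beam):
    """Viterbi beam search over a decoding network (illustrative sketch).

    arcs maps a node to a list of (next_node, transition_log_prob).
    emit_logp(node, frame) returns the log-likelihood of a frame at a node.
    Returns (score, path, number_of_likelihood_evaluations).
    """
    cache = {}  # (node, frame index) -> log-likelihood, filled on demand

    def likelihood(node, t):
        # Compute a likelihood only the first time the search asks for it.
        if (node, t) not in cache:
            cache[(node, t)] = emit_logp(node, frames[t])
        return cache[(node, t)]

    # Active set: node -> (best log score so far, best path so far).
    active = {start: (likelihood(start, 0), [start])}
    for t in range(1, len(frames)):
        extended = {}
        for node, (score, path) in active.items():
            for nxt, trans_logp in arcs.get(node, []):
                cand = score + trans_logp
                # Viterbi recombination: keep the best predecessor per node.
                if nxt not in extended or cand > extended[nxt][0]:
                    extended[nxt] = (cand, path + [nxt])
        if not extended:
            return None, None, len(cache)
        # Likelihoods are requested only for nodes that were reached.
        scored = {n: (s + likelihood(n, t), p) for n, (s, p) in extended.items()}
        best = max(s for s, _ in scored.values())
        # Beam pruning: drop nodes scoring more than `beam` below the best.
        active = {n: sp for n, sp in scored.items() if sp[0] >= best - beam}
    score, path = active.get(final, (None, None))
    return score, path, len(cache)


# Hypothetical three-node left-to-right network with toy scores.
TARGETS = {"a": 0, "b": 1, "c": 2}
ARCS = {
    "a": [("a", 0.0), ("b", 0.0)],
    "b": [("b", 0.0), ("c", 0.0)],
    "c": [("c", 0.0)],
}


def toy_logp(node, frame):
    # Assumed toy score: negative distance between frame and node target.
    return -float(abs(frame - TARGETS[node]))


FRAMES = [0, 0, 1, 1, 2, 2]
score, path, n_evals = beam_search(FRAMES, ARCS, "a", "c", toy_logp, beam=0.5)
```

On this toy input the search evaluates 10 of the 18 possible (node, frame) likelihoods while still recovering the best path. In general a beam that is too narrow can prune the true best path, so the beam width has to be chosen wide enough if the accuracy of the full search is to be maintained.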