Reducing computational complexity and response latency through the detection of contentless frames

  • Authors:
  • R. A. Sukkar;S. M. Herman;A. R. Setlur;C. D. Mitchell

  • Affiliations:
  • Lucent Technol., Naperville, IL, USA;-;-;-

  • Venue:
  • ICASSP '00: Proceedings of the 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 06
  • Year:
  • 2000

Abstract

In this paper, we present a method that manipulates the decoding network to reduce both computational complexity and response latency while maintaining high automatic speech recognition (ASR) accuracy. The method employs a tree-structured vector quantization (TSVQ) classifier that reliably discriminates between silence and non-silence frames. Reductions in computational complexity and response latency are achieved through three techniques: 1) silence skipping, 2) silence-based pruning of the dynamic programming network, and 3) early decision. Experimental results on a connected-digit task and a large-vocabulary company-name task show that the proposed method can reduce ASR response latency by more than 82%. Furthermore, computational complexity, measured in CPU seconds, was reduced by 13.6% on the connected-digit task and 6.7% on the company-name task while maintaining the recognition accuracy of the baseline system.
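
The abstract does not give implementation details, so the following is only a minimal illustrative sketch of two of the ideas it names: a tree-structured VQ lookup that labels each frame as silence or speech, and a front end that skips silence frames and signals an early decision once enough trailing silence has been seen. Everything here is an assumption made for illustration: the `Node` layout, the hand-built `TOY_TREE` over one-dimensional features, the `trailing_silence` parameter, and the rule used to trigger the early decision. It is not the authors' classifier or decoder, and silence-based pruning of the dynamic programming network is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Frame = Tuple[float, ...]


@dataclass
class Node:
    """One node of a binary TSVQ tree (hypothetical layout, not from the paper)."""
    centroid: Frame
    label: Optional[str] = None      # set on leaves only
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def sq_dist(a: Frame, b: Frame) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(frame: Frame, root: Node) -> str:
    """Descend the tree, at each level moving to the child with the nearer centroid."""
    node = root
    while node.label is None:
        if sq_dist(frame, node.left.centroid) <= sq_dist(frame, node.right.centroid):
            node = node.left
        else:
            node = node.right
    return node.label


def skip_silence(
    frames: Sequence[Frame], root: Node, trailing_silence: int = 3
) -> Tuple[List[Frame], Optional[int]]:
    """Drop silence frames; stop early after speech followed by enough silence.

    Returns (kept_frames, decision_index). decision_index is the index of the
    frame at which the assumed early-decision rule fires, or None if it never does.
    """
    kept: List[Frame] = []
    run = 0  # length of the current run of consecutive silence frames
    for i, frame in enumerate(frames):
        if classify(frame, root) == "silence":
            run += 1
            if kept and run >= trailing_silence:
                return kept, i
        else:
            run = 0
            kept.append(frame)
    return kept, None


# Toy tree over a 1-D feature (think of it as log energy); values are made up.
TOY_TREE = Node(
    centroid=(4.0,),
    left=Node((0.5,), label="silence"),
    right=Node(
        (6.0,),
        left=Node((4.0,), label="speech"),
        right=Node((8.0,), label="speech"),
    ),
)

frames = [(0.2,), (0.4,), (5.0,), (7.5,), (0.3,), (6.2,),
          (0.1,), (0.2,), (0.3,), (0.1,)]
kept, decision_index = skip_silence(frames, TOY_TREE)
```

In this toy run only the three speech frames would be handed to a decoder, and the stopping rule fires before the last frame is read. The appeal of a tree-structured lookup is that each frame costs a number of distance computations proportional to the tree depth rather than to the codebook size, so the classifier itself stays cheap relative to the decoding work it saves.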