Towards realizable, low-cost broadcast systems for dynamic environments
IEEE/ACM Transactions on Networking (TON)
In this paper, we present a method that manipulates the decoding network to reduce both computational complexity and response latency while maintaining high automatic speech recognition (ASR) accuracy. The method employs a tree-structured vector quantization (TSVQ) classifier that reliably discriminates between silence and non-silence frames. Computational complexity and response latency are reduced through three techniques: 1) silence skipping, 2) silence-based pruning of the dynamic programming network, and 3) early decision. Experimental results on a connected-digit task and a large-vocabulary company-name task show that the proposed method reduces ASR response latency by more than 82%. Furthermore, computational complexity, measured in CPU seconds, was reduced by 13.6% on the connected-digit task and by 6.7% on the company-name task, with no loss of recognition accuracy relative to the baseline system.
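As a rough illustration of the first technique, silence skipping driven by a TSVQ classifier might be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the features, the tree depth, or how the tree is trained, so the names `Node`, `classify_silence`, and `skip_silence`, the one-dimensional toy feature, and the hand-built two-leaf tree are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

Frame = Sequence[float]  # one feature vector per frame (hypothetical layout)


@dataclass
class Node:
    """A node of a binary TSVQ tree (hypothetical structure)."""
    centroid: Frame
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    is_silence: bool = False  # only meaningful at leaf nodes


def sq_dist(a: Frame, b: Frame) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify_silence(root: Node, frame: Frame) -> bool:
    """Descend the tree, following the nearer child centroid at each level,
    and return the silence label stored at the leaf that is reached."""
    node = root
    while node.left is not None and node.right is not None:
        d_left = sq_dist(frame, node.left.centroid)
        d_right = sq_dist(frame, node.right.centroid)
        node = node.left if d_left <= d_right else node.right
    return node.is_silence


def skip_silence(root: Node, frames: List[Frame]) -> List[Frame]:
    """Silence skipping: keep only frames classified as non-silence,
    so that the decoder never has to process the silent ones."""
    return [f for f in frames if not classify_silence(root, f)]


# Toy tree over a single assumed feature (e.g. a normalized energy value):
# low values fall into the silence leaf, high values into the speech leaf.
toy_tree = Node(
    centroid=[0.5],
    left=Node(centroid=[0.0], is_silence=True),
    right=Node(centroid=[1.0], is_silence=False),
)

toy_frames = [[0.1], [0.9], [0.05], [0.8]]
kept_frames = skip_silence(toy_tree, toy_frames)
```

In this toy setup only the two high-valued frames would be passed on to the decoder. The tree search costs one pair of distance computations per level, which is why a tree-structured quantizer is a plausible choice for a per-frame classifier that must itself be cheap.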