Combining Media-Specific FEC and Error Concealment for Robust Distributed Speech Recognition Over Loss-Prone Packet Channels

Authors:
A. M. Gomez; A. M. Peinado; V. Sanchez; A. J. Rubio
Affiliations:
Dept. de Teoria de la Senal, Telematica y Comunicaciones, Granada Univ.
Venue:
IEEE Transactions on Multimedia
Year:
2006

Abstract

This paper presents a mixed recovery scheme for robust distributed speech recognition (DSR) over a packet channel subject to packet loss. The scheme combines media-specific forward error correction (FEC) with error concealment (EC). Media-specific FEC is applied at the client side, where FEC bits representing coarsely quantized versions of the speech vectors are added to the transmitted stream. At the server side, the EC algorithm uses the information carried by those FEC bits to improve recognition performance. We investigate how to adapt two EC techniques for use alongside FEC: minimum mean square error (MMSE) estimation, which operates at the decoding stage, and weighted Viterbi recognition (WVR), which applies EC at the recognition stage. The experimental results show that a significant increase in recognition accuracy can be obtained with very little increase in bandwidth, which may be null in practice, and a limited increase in latency, which in any case is not critical for an application such as DSR.
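
The general idea of media-specific FEC feeding an error concealment stage can be illustrated with a minimal Python sketch. This is not the authors' implementation: the uniform scalar quantizers, the step sizes, the FEC_OFFSET placement rule, the repetition fallback (used here in place of MMSE estimation), and the confidence values (a stand-in for the per-frame weights a weighted Viterbi decoder might use) are all assumptions made for illustration.

```python
# Illustrative sketch only; every constant and function name here is hypothetical.

FINE_STEP = 0.01    # assumed resolution of the primary quantizer
COARSE_STEP = 0.5   # assumed resolution of the FEC quantizer (far fewer bits)
FEC_OFFSET = 2      # assumed: packet n also carries a coarse copy of frame n - FEC_OFFSET


def quantize(value, step):
    """Uniform scalar quantizer: map a value to an integer index."""
    return round(value / step)


def dequantize(index, step):
    """Map a quantizer index back to its reconstruction value."""
    return index * step


def packetize(frames):
    """Client side: one packet per frame, plus a coarse copy of an earlier frame."""
    packets = []
    for n, frame in enumerate(frames):
        primary = [quantize(x, FINE_STEP) for x in frame]
        src = n - FEC_OFFSET
        fec = [quantize(x, COARSE_STEP) for x in frames[src]] if src >= 0 else None
        packets.append({"seq": n, "primary": primary, "fec": fec})
    return packets


def receive(packets, lost):
    """Server side: rebuild every frame and attach a confidence in [0, 1].

    `lost` is the set of sequence numbers dropped by the channel.
    Returns a list of (vector, confidence) pairs, one per transmitted frame.
    """
    arrived = {p["seq"]: p for p in packets if p["seq"] not in lost}
    dim = len(packets[0]["primary"]) if packets else 0
    output = []
    last_vector = None
    for n in range(len(packets)):
        if n in arrived:
            # Primary data arrived: full-resolution reconstruction.
            vector = [dequantize(i, FINE_STEP) for i in arrived[n]["primary"]]
            confidence = 1.0
        else:
            carrier = arrived.get(n + FEC_OFFSET)
            if carrier is not None and carrier["fec"] is not None:
                # Primary lost, but a later packet carries the coarse FEC copy.
                vector = [dequantize(i, COARSE_STEP) for i in carrier["fec"]]
                confidence = 0.5
            elif last_vector is not None:
                # No FEC available: fall back to repeating the previous frame.
                vector = list(last_vector)
                confidence = 0.1
            else:
                # Nothing to go on at all: emit zeros with no confidence.
                vector = [0.0] * dim
                confidence = 0.0
        output.append((vector, confidence))
        last_vector = vector
    return output
```

Because the coarse copy of a frame travels in a later packet, the receiver has to wait FEC_OFFSET packets before it can conceal a loss, which mirrors the latency cost mentioned in the abstract. A recognizer could then use the confidence values to down-weight frames that were rebuilt from FEC bits or by repetition.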