Reducing bandwidth for robust distributed speech recognition in conditions of packet loss

Authors:
Ronan Flynn; Edward Jones
Affiliations:
School of Engineering, Athlone Institute of Technology, Ireland; College of Engineering & Informatics, National University of Ireland, Galway, Ireland
Venue:
Speech Communication
Year:
2012

Abstract

This paper proposes a method to reduce the bandwidth requirements of a distributed speech recognition (DSR) system with minimal impact on recognition performance. Bandwidth reduction is achieved by applying a wavelet decomposition to feature vectors extracted from speech using an auditory-based front-end. The resulting vectors undergo vector quantisation and are then combined in pairs for transmission over a statistically modelled channel that is subject to packet burst loss. Recognition performance is evaluated in the presence of both background noise and packet loss. When there is no packet loss, results show that the proposed method can reduce the required bandwidth to 50% of that needed by the baseline system, in which the proposed method is not used, without compromising recognition performance. The bandwidth can be further reduced to 25% of the baseline at the cost of a slight decrease in recognition performance. Furthermore, in the presence of packet loss, the proposed method, when combined with a suitable redundancy scheme, gives a 29% reduction in bandwidth relative to an established packet loss mitigation technique against which its recognition performance is compared.
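
The bandwidth figures in the abstract can be illustrated with a minimal sketch. The abstract does not name the wavelet, the axis along which the decomposition is applied, or the channel model, so everything specific here is an assumption: a Haar wavelet applied along the time axis of the feature-vector sequence, with only the approximation coefficients kept, and a two-state (Gilbert-style) Markov model for burst loss. The function names `haar_approx`, `haar_reconstruct` and `gilbert_loss` are hypothetical, and the vector quantisation and packet-pairing steps are omitted.

```python
import math
import random


def haar_approx(frames, levels=1):
    """Keep only the Haar approximation coefficients along the time axis.

    frames: list of equal-length feature vectors, one per time step.
    Each level halves the number of vectors that would be transmitted.
    (Assumption: Haar wavelet, temporal decomposition.)
    """
    out = [list(f) for f in frames]
    for _ in range(levels):
        if len(out) % 2:  # pad odd-length sequences by repeating the last frame
            out.append(list(out[-1]))
        out = [
            [(a + b) / math.sqrt(2) for a, b in zip(out[i], out[i + 1])]
            for i in range(0, len(out), 2)
        ]
    return out


def haar_reconstruct(approx, levels=1):
    """Invert haar_approx, treating all discarded detail coefficients as zero."""
    out = [list(f) for f in approx]
    for _ in range(levels):
        upsampled = []
        for f in out:
            v = [x / math.sqrt(2) for x in f]
            upsampled.extend([v, list(v)])  # each coefficient yields two frames
        out = upsampled
    return out


def gilbert_loss(n_packets, p_good_to_bad, p_bad_to_good, seed=0):
    """Two-state Markov loss pattern; True marks a lost packet.

    (Assumption: packets are lost exactly when the channel is in the bad state.)
    """
    rng = random.Random(seed)
    lost, bad = [], False
    for _ in range(n_packets):
        if bad:
            bad = rng.random() >= p_bad_to_good  # stay bad unless we recover
        else:
            bad = rng.random() < p_good_to_bad   # enter a loss burst
        lost.append(bad)
    return lost


# Toy input: 8 frames of 2-dimensional features.
frames = [[float(t), float(2 * t)] for t in range(8)]
half = haar_approx(frames, levels=1)     # 4 vectors: 50% of the original count
quarter = haar_approx(frames, levels=2)  # 2 vectors: 25% of the original count
pattern = gilbert_loss(20, p_good_to_bad=0.1, p_bad_to_good=0.5)
```

Under these assumptions, one decomposition level corresponds to the 50% figure and two levels to the 25% figure, in terms of the number of vectors sent. Reconstruction at the receiver is exact only where the discarded detail coefficients were zero, which is one way to read the slight loss in recognition performance reported at 25%.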