Quantization of cepstral parameters for speech recognition over the World Wide Web

Authors:
V. V. Digalakis;L. G. Neumeyer;M. Perakakis
Affiliations:
Dept. of Electron. & Comput. Eng., Tech. Univ., Heraklion;-;-
Venue:
IEEE Journal on Selected Areas in Communications
Year:
2006

Citing 0
Cited 4

A Configurable Logic Based Architecture for Real-Time Continuous Speech Recognition Using Hidden Markov Models
Journal of VLSI Signal Processing Systems - Special issue on VLSI on custom computing technology

Entropy coding of compressed feature parameters for distributed speech recognition
Speech Communication

A DCOM-based Turkish speech recognition system: TREN – Turkish Recognition ENgine
ISCIS'05 Proceedings of the 20th international conference on Computer and Information Sciences

Low bit rate compression methods of feature vectors for distributed speech recognition
Speech Communication

Quantified Score

Hi-index	0.07

Abstract

We examine alternative architectures for a client-server model of speech-enabled applications over the World Wide Web (WWW). We compare a server-only processing model, in which the client encodes and transmits the speech signal to the server, with a model in which the recognition front end runs locally at the client and encodes and transmits the cepstral coefficients to the recognition server over the Internet. We follow a novel encoding paradigm that aims to maximize recognition performance rather than perceptual reproduction, and we find that transmitting the cepstral coefficients achieves significantly higher recognition performance at a fraction of the bit rate required when encoding the speech signal directly. The bit rate required to match the recognition performance of high-quality unquantized speech is just 2000 bits per second.
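
To make the bit-rate figure concrete, the following is a minimal sketch of how a quantized cepstral frame maps to a transmission rate. It is not the quantizer described in the paper, whose design the abstract does not specify. The frame rate (100 frames per second), the number of coefficients (10), the 2-bit allocation per coefficient, the normalized value range, and all function names are assumptions chosen only so that the total comes to 2000 bits per second.

```python
# Illustrative sketch only: uniform scalar quantization of one cepstral frame.
# Frame rate, bit allocation, and value range are assumptions, not values
# taken from the paper.

FRAME_RATE_HZ = 100          # assumed: one feature vector every 10 ms
BITS_PER_COEFF = [2] * 10    # assumed: 10 coefficients, 2 bits each (20 bits per frame)
VALUE_RANGE = (-1.0, 1.0)    # assumed range of each normalized coefficient


def bit_rate(bits_per_coeff, frame_rate_hz):
    """Bits per second needed to send one quantized vector per frame."""
    return sum(bits_per_coeff) * frame_rate_hz


def quantize(frame, bits_per_coeff, value_range):
    """Map each coefficient to an integer index on a uniform grid."""
    lo, hi = value_range
    indices = []
    for x, bits in zip(frame, bits_per_coeff):
        levels = 2 ** bits
        step = (hi - lo) / levels
        clipped = min(max(x, lo), hi)           # keep the value inside the range
        idx = min(int((clipped - lo) / step), levels - 1)
        indices.append(idx)
    return indices


def dequantize(indices, bits_per_coeff, value_range):
    """Reconstruct each coefficient as the midpoint of its quantization cell."""
    lo, hi = value_range
    out = []
    for idx, bits in zip(indices, bits_per_coeff):
        step = (hi - lo) / (2 ** bits)
        out.append(lo + (idx + 0.5) * step)
    return out
```

Under these assumptions the client would send only the integer indices, 20 bits per frame, and the server would call `dequantize` before recognition. A practical system would likely allocate bits unevenly across coefficients or use vector quantization, so the uniform allocation here is purely for illustration.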