Tandem connectionist feature extraction for conversational speech recognition

Authors:
Qifeng Zhu;Barry Chen;Nelson Morgan;Andreas Stolcke
Affiliations:
International Computer Science Institute;International Computer Science Institute;International Computer Science Institute;International Computer Science Institute
Venue:
MLMI'04 Proceedings of the First international conference on Machine Learning for Multimodal Interaction
Year:
2004

Cites (2):
Links Between Markov Models and Multilayer Perceptrons. IEEE Transactions on Pattern Analysis and Machine Intelligence
Connectionist speech recognition of Broadcast News. Speech Communication - Special issue on automatic transcription of broadcast news data

Cited by (5):
Robust multi-stream keyword and non-linguistic vocalization detection for computationally intelligent virtual agents. ISNN'11 Proceedings of the 8th international conference on Advances in neural networks - Volume Part II
Acoustic modeling problem for automatic speech recognition system: advances and refinements (Part II). International Journal of Speech Technology
Enhancing spontaneous speech recognition with BLSTM features. NOLISP'11 Proceedings of the 5th international conference on Advances in nonlinear speech processing
Keyword spotting exploiting Long Short-Term Memory. Speech Communication
Probabilistic speech feature extraction with context-sensitive Bottleneck neural networks. Neurocomputing

Abstract

Multi-Layer Perceptrons (MLPs) can be used in automatic speech recognition in many ways. One application that has received particular attention over the last few years is the Tandem approach, described in [7] and in more recent publications. Here we discuss the characteristics of the MLP-based features used in the Tandem approach and conclude with a report on their application to conversational speech recognition. The paper shows that MLP transformations yield variables with regular distributions, which can be further modified by taking the logarithm so that the distribution is easier to model with a Gaussian-HMM. Two or more vectors of these features can easily be combined without increasing the feature dimension. We also report recognition results showing that MLP features can significantly improve recognition performance on the NIST 2001 Hub-5 evaluation set with models trained on the Switchboard Corpus, even for complex systems incorporating MMIE training and other enhancements.
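
The two feature operations named in the abstract, taking the logarithm of MLP outputs and combining several feature vectors without growing the dimension, can be illustrated with a minimal sketch. Everything concrete here is an assumption rather than the paper's method: the abstract does not say how vectors are combined, so frame-wise averaging of posteriors is used as one plausible choice, and the function names, the 46-class output size, and the random stand-in posteriors are hypothetical.

```python
import numpy as np


def softmax(x):
    """Row-wise softmax, used here only to fabricate stand-in MLP outputs."""
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def combine_posteriors(streams):
    """Average posterior streams frame by frame (assumed combination rule).

    Each stream has shape (frames, classes); the result has the same shape,
    so the feature dimension does not grow with the number of streams.
    """
    return np.mean(np.stack(streams, axis=0), axis=0)


def log_posterior_features(posteriors, eps=1e-10):
    """Take the log of posteriors, clipping to avoid log(0)."""
    return np.log(np.clip(posteriors, eps, 1.0))


# Stand-in data: two hypothetical MLPs, 100 frames, 46 output classes.
rng = np.random.default_rng(0)
post_a = softmax(3.0 * rng.normal(size=(100, 46)))
post_b = softmax(3.0 * rng.normal(size=(100, 46)))

combined = combine_posteriors([post_a, post_b])
features = log_posterior_features(combined)
print(features.shape)  # (100, 46)
```

Averaging keeps each combined frame a valid probability vector, which is why the logarithm can be applied afterwards in the same way as for a single MLP. In a real system the posteriors would come from trained networks, and the resulting features would be passed to the Gaussian-HMM back end.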