Modular construction of time-delay neural networks for speech recognition
Neural Computation
We present a comparison of several Time-Delay Neural Network (TDNN) [1] architectures as the preprocessing step for speaker-dependent continuous-speech recognition systems based on Hidden Markov Models (HMMs). We define a modular TDNN architecture on the basis of acoustic-phonetic knowledge, in which each sub-network is trained on a different subset of phonemes. This allows us to define a hierarchical tree structure of sub-networks, which in turn provides a framework for enlarging the number of outputs by defining context-dependent sub-networks. We also compare two methods for integrating TDNNs into an HMM framework: a discrete and a continuous integration. For speaker JWSA of the speaker-dependent DARPA RM1 database, with context-independent phonemes, a word error rate of 21.3% is obtained without grammar, and 4.6% with the DARPA word-pair grammar (perplexity 60).
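To make the time-delay idea concrete, the following is a minimal sketch of one TDNN layer in the style of [1]: each output frame combines the input frames at several fixed delays, i.e. a 1-D convolution over time. The function name, feature sizes, and delay set are illustrative assumptions, not the architecture used in the paper.

```python
import numpy as np

def tdnn_layer(frames, weights, bias, delays):
    """One time-delay layer (illustrative sketch, not the paper's exact net).

    frames:  (T, F) sequence of acoustic feature frames
    weights: (len(delays), F, H) one weight matrix per delay
    bias:    (H,) shared bias for the H hidden units
    delays:  list of integer frame offsets, e.g. [0, 1, 2]
    returns: (T - max(delays), H) squashed activations
    """
    T = frames.shape[0]
    span = max(delays)
    out = np.zeros((T - span, weights.shape[2]))
    for t in range(T - span):
        pre = bias.copy()
        for i, d in enumerate(delays):
            # the same weights are applied at every time step (tied weights)
            pre += frames[t + d] @ weights[i]
        out[t] = np.tanh(pre)  # sigmoid-like squashing nonlinearity
    return out

# Tiny usage example with random frames: 10 frames of 16 features
rng = np.random.default_rng(0)
acts = tdnn_layer(rng.standard_normal((10, 16)),
                  0.1 * rng.standard_normal((3, 16, 8)),
                  np.zeros(8),
                  delays=[0, 1, 2])
print(acts.shape)  # (8, 8): two frames lost to the delay span
```

In a modular setup like the one described above, several such sub-networks, each trained on its own phoneme subset, would feed a higher-level combination stage.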