Incorporating acoustic-phonetic knowledge in hybrid TDNN/HMM frameworks

  • Authors:
  • Christian Dugast; Laurence Devillers

  • Affiliations:
  • Philips Research Laboratories Aachen, Aachen, Germany; LIMSI, CNRS, Orsay Cedex, France

  • Venue:
  • ICASSP '92: Proceedings of the 1992 IEEE International Conference on Acoustics, Speech and Signal Processing - Volume 1
  • Year:
  • 1992

Abstract

We present a comparison of several architectures of Time-Delay Neural Networks (TDNNs) [1] as the preprocessing step for Hidden Markov Model (HMM) speaker-dependent continuous-speech recognition systems. We define a modular TDNN architecture on the basis of acoustic-phonetic knowledge, where each sub-network is trained on a different subset of phonemes. This allows us to define a hierarchical tree structure of sub-networks, which in turn offers a framework for enlarging the number of outputs by defining context-dependent sub-networks. We also compare two methods for integrating TDNNs in an HMM framework: a discrete and a continuous integration. For speaker JWSA of the speaker-dependent DARPA RM1 database, with context-independent phonemes, a 21.3% word error rate is obtained without grammar and 4.6% with the DARPA word-pair grammar (perplexity 60).
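The modular architecture described in the abstract can be sketched in miniature: a root network scores broad phonetic classes per frame, each sub-network scores phonemes within its class, and the products yield full phoneme posteriors that could serve as HMM emission scores. This is a minimal illustrative sketch, not the paper's implementation; the class partition, network sizes, and random weights are all assumptions, and each "TDNN" is reduced to a single time-delay (context-stacking) layer with a softmax.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical broad classes and phoneme subsets (illustrative only,
# not the partition used in the paper).
SUBSETS = {
    "vowels":     ["aa", "iy", "uw"],
    "fricatives": ["s", "f", "sh"],
    "plosives":   ["p", "t", "k"],
}

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class TDNNSubnet:
    """One sub-network: a time-delay layer (each frame stacked with its
    +/- `delay` context frames) followed by a softmax over its outputs."""
    def __init__(self, n_feats, n_out, delay=2):
        self.delay = delay
        width = n_feats * (2 * delay + 1)
        self.W = rng.standard_normal((width, n_out)) * 0.1  # untrained weights

    def __call__(self, feats):
        T, _ = feats.shape
        pad = np.pad(feats, ((self.delay, self.delay), (0, 0)))
        # Stack each frame with its temporal context -- the "time delay".
        ctx = np.stack([pad[t:t + 2 * self.delay + 1].ravel()
                        for t in range(T)])
        return softmax(ctx @ self.W)  # (T, n_out) posteriors over the subset

class ModularTDNN:
    """Hierarchy: root scores the broad classes; each sub-network scores
    phonemes within its class; products give full phoneme posteriors."""
    def __init__(self, n_feats):
        self.root = TDNNSubnet(n_feats, len(SUBSETS))
        self.subnets = {c: TDNNSubnet(n_feats, len(p))
                        for c, p in SUBSETS.items()}
        self.phonemes = [p for ps in SUBSETS.values() for p in ps]

    def posteriors(self, feats):
        class_post = self.root(feats)  # (T, n_classes)
        cols = [class_post[:, i:i + 1] * net(feats)
                for i, net in enumerate(self.subnets.values())]
        return np.hstack(cols)  # (T, n_phonemes); each row sums to 1

feats = rng.standard_normal((50, 13))  # 50 frames of 13 MFCC-like features
post = ModularTDNN(n_feats=13).posteriors(feats)
print(post.shape)
```

Because the root softmax and each subset softmax both normalize to one, the combined per-frame distribution over all phonemes also sums to one, which is what lets the tree be grown (e.g. with context-dependent sub-networks) without renormalizing the whole output layer.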