Allophone modeling for vocabulary-independent HMM recognition

  • Authors:
  • Wendy J. Holmes; Lynn C. Wood; David J. B. Pearce

  • Affiliations:
  • GEC-Marconi Limited, Hirst Research Centre, Wembley, Middlesex, UK (all authors)

  • Venue:
  • ICASSP'93 Proceedings of the 1993 IEEE international conference on Acoustics, speech, and signal processing: speech processing - Volume II
  • Year:
  • 1993

Abstract

This paper describes the use of sub-word units based on allophones with an allophone-dependent model structure, to improve sub-word HMM recognition performance when using vocabulary-independent training. The new system is an extension of an approach based on sub-triphone units called phonicles. The original system [1] modeled major phonetic context effects, but did not take account of context effects wider than one immediately adjacent phone or the differences in duration and spectral complexity which exist between different types of phoneme. The recognition system has therefore been extended so that phoneme transcriptions are first converted to allophone transcriptions. Each allophone is then transformed to a sequence of one or more allophonicles, where different allophonicles can have different numbers of states and one allophonicle may be shared across allophones. Using a Mel cepstrum front end, isolated-word speaker-dependent recognition experiments on six application vocabularies have shown extremely good recognition performance for allophonicle models, with an average error rate of only 0.3%.
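
The two-stage conversion described in the abstract (phonemes to allophones, then allophones to allophonicle sequences) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the unit names, the state counts, the single context rule for /t/, and the function names are all hypothetical. The sketch only shows the two properties the abstract states, namely that allophonicles may have different numbers of states and that one allophonicle may be shared across allophones.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Allophonicle:
    """A sub-word HMM unit with its own number of states."""
    name: str
    n_states: int


# Hypothetical inventory: names and state counts are invented for illustration.
UNITS: Dict[str, Allophonicle] = {
    u.name: u
    for u in [
        Allophonicle("t_closure", 1),
        Allophonicle("t_burst", 2),
        Allophonicle("k_closure", 1),
        Allophonicle("k_burst", 2),
        Allophonicle("ae_body", 3),
    ]
}

# Hypothetical mapping from each allophone to its allophonicle sequence.
# "t_closure" is shared by two allophones of /t/.
ALLOPHONE_UNITS: Dict[str, List[str]] = {
    "t_released": ["t_closure", "t_burst"],
    "t_unreleased": ["t_closure"],
    "k_released": ["k_closure", "k_burst"],
    "ae": ["ae_body"],
}


def to_allophones(phonemes: List[str]) -> List[str]:
    """Stage 1: convert a phoneme transcription to allophones.

    Toy rule (an assumption, not the paper's rule set): a word-final /t/
    becomes "t_unreleased", any other /t/ becomes "t_released".
    """
    allophones = []
    for i, phoneme in enumerate(phonemes):
        is_final = i == len(phonemes) - 1
        if phoneme == "t":
            allophones.append("t_unreleased" if is_final else "t_released")
        elif phoneme == "k":
            allophones.append("k_released")
        else:
            allophones.append(phoneme)
    return allophones


def to_allophonicles(allophones: List[str]) -> List[Allophonicle]:
    """Stage 2: expand each allophone into one or more allophonicles."""
    return [UNITS[name] for a in allophones for name in ALLOPHONE_UNITS[a]]


def word_model(phonemes: List[str]) -> List[Allophonicle]:
    """Build the allophonicle sequence for a word from its phonemes."""
    return to_allophonicles(to_allophones(phonemes))


if __name__ == "__main__":
    for word in (["k", "ae", "t"], ["t", "ae", "k"]):
        units = word_model(word)
        total = sum(u.n_states for u in units)
        print(word, [u.name for u in units], "states:", total)
```

In this toy setup the same phoneme /t/ yields a one-unit model word-finally and a two-unit model elsewhere, so the total number of HMM states for a word depends on the allophones chosen rather than on a fixed per-phoneme topology.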