Phonemic HMM constrained by statistical VQ-code transition

  • Authors:
  • Satoshi Takahashi; Tatsuo Matsuoka; Kiyohiro Shikano

  • Affiliations:
  • NTT Human Interface Laboratories, Musashino-Shi, Tokyo, Japan

  • Venue:
  • ICASSP'92: Proceedings of the 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 1
  • Year:
  • 1992


Abstract

This paper proposes a new hidden Markov modeling technique that uses statistical modeling of VQ-code transitions. A bigram-constrained HMM is obtained by combining a VQ-code bigram with a conventional speaker-independent HMM. The proposed model reduces the overlap of feature distributions between different phonemes by restricting local VQ-code transitions. The output probabilities of the model are conditioned on the VQ-code of the previous frame, so the output probability distribution changes with the previous frame even within the same state. A speaker-dependent bigram-constrained HMM is obtained using a VQ-code bigram estimated from utterances of the input speaker, and a speaker-independent bigram-constrained HMM is obtained using a VQ-code bigram estimated from utterances of many speakers. The model was evaluated in an 18-Japanese-consonant recognition experiment using 5240 words. The speaker-independent bigram-constrained HMM achieved an average recognition accuracy of 76.3%, which is 5.5% higher than the conventional speaker-independent HMM.
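
The sketch below is a minimal illustration (not the authors' implementation) of the idea described in the abstract: a discrete HMM's output probabilities are combined with a VQ-code bigram so that the emission distribution in a state depends on the previous frame's VQ code. The specific combination rule (multiply the state emission probability by the bigram probability and renormalize over the current code) and all array shapes and names are assumptions made for illustration.

```python
import numpy as np

def bigram_constrained_emissions(emission, bigram):
    """Combine discrete-HMM output probabilities with a VQ-code bigram.

    emission: (n_states, n_codes) output probabilities b_j(k)
    bigram:   (n_codes, n_codes) VQ-code transition probabilities P(k | k_prev)
    Returns:  (n_states, n_codes, n_codes) probabilities b_j(k | k_prev),
              renormalized over the current code k (assumed normalization).
    """
    combined = emission[:, None, :] * bigram[None, :, :]   # b_j(k) * P(k | k_prev)
    norm = combined.sum(axis=-1, keepdims=True)             # normalize over k
    return np.where(norm > 0, combined / norm, 0.0)

# Toy usage: 3 states, 4 VQ codes (random distributions, for illustration only)
rng = np.random.default_rng(0)
b = rng.dirichlet(np.ones(4), size=3)        # conventional HMM emissions b_j(k)
big = rng.dirichlet(np.ones(4), size=4)      # VQ-code bigram P(k | k_prev)
b_c = bigram_constrained_emissions(b, big)

# The emission probability at frame t now depends on both the current code
# c[t] and the previous code c[t-1], even within the same state.
codes = [2, 0, 3, 1]
state = 1
for t in range(1, len(codes)):
    p = b_c[state, codes[t - 1], codes[t]]
    print(f"frame {t}: P(c_t={codes[t]} | c_t-1={codes[t-1]}, state {state}) = {p:.3f}")
```

In this reading, a speaker-dependent variant would estimate `big` from the input speaker's utterances, while a speaker-independent variant would estimate it from many speakers, as the abstract describes.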