Phonemic HMM constrained by statistical VQ-code transition

Authors:
Satoshi Takahashi;Tatsuo Matsuoka;Kiyohiro Shikano
Affiliations:
NTT Human Interface Laboratories, Musashino-Shi, Tokyo, Japan;NTT Human Interface Laboratories, Musashino-Shi, Tokyo, Japan;NTT Human Interface Laboratories, Musashino-Shi, Tokyo, Japan
Venue:
ICASSP '92: Proceedings of the 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 1
Year:
1992

Abstract

This paper proposes a new hidden Markov modeling technique that uses statistical modeling of VQ-code transitions. A bigram-constrained HMM is obtained by combining a VQ-code bigram with the conventional speaker-independent HMM. The proposed model reduces the overlap of feature distributions between different phonemes by restricting local VQ-code transitions. The output probabilities in the model are conditioned on the VQ code of the previous frame, so the output probability distribution changes with the previous frame even within the same state. A speaker-dependent bigram-constrained HMM is obtained using a VQ-code bigram calculated from utterances of the input speaker. A speaker-independent bigram-constrained HMM is obtained using a VQ-code bigram calculated from utterances of many speakers. The model was evaluated in an 18-Japanese-consonant recognition experiment using 5240 words. The speaker-independent bigram-constrained HMM achieved an average recognition accuracy of 76.3%, which is 5.5% higher than that of the conventional speaker-independent HMM.
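
The conditioning described in the abstract can be illustrated with a minimal sketch. The abstract does not give the rule for combining the bigram with the HMM, so the rule used here (multiply the state output probability by the bigram probability, then renormalize over the current code) is an assumption made for illustration. The function names `bigram_constrained_output` and `forward_likelihood` are hypothetical, and the sketch is written in plain Python for a discrete HMM over VQ codes.

```python
def bigram_constrained_output(b, bigram):
    """Combine state output probabilities with a VQ-code bigram.

    b[s][k]      : P(code k | state s) from a conventional discrete HMM
    bigram[j][k] : P(code k | previous code j) estimated from VQ-coded speech
    Returns c, where c[s][j][k] is proportional to b[s][k] * bigram[j][k]
    and is renormalized over k. This combination rule is an assumption,
    not a rule stated in the abstract.
    """
    c = []
    for b_s in b:
        per_prev = []
        for row in bigram:
            joint = [p * q for p, q in zip(b_s, row)]
            total = sum(joint)
            # If the bigram rules out every code this state can emit,
            # fall back to the unconstrained distribution.
            per_prev.append([x / total for x in joint] if total > 0 else list(b_s))
        c.append(per_prev)
    return c


def forward_likelihood(pi, A, b, c, codes):
    """Forward algorithm: the first frame uses b, later frames use c[s][prev][cur]."""
    n = len(pi)
    alpha = [pi[s] * b[s][codes[0]] for s in range(n)]
    for prev, cur in zip(codes, codes[1:]):
        alpha = [
            sum(alpha[r] * A[r][s] for r in range(n)) * c[s][prev][cur]
            for s in range(n)
        ]
    return sum(alpha)
```

Under this assumed rule, the same state yields a different output distribution for each previous code, which matches the behavior the abstract describes. Supplying a bigram estimated from one speaker or from many speakers would correspond to the speaker-dependent and speaker-independent variants, respectively.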