The acoustic-modeling problem in automatic speech recognition
Phoneme HMMs constrained by frame correlations
ICASSP'93 Proceedings of the 1993 IEEE international conference on Acoustics, speech, and signal processing: speech processing - Volume II
This paper proposes a new hidden Markov modeling technique that uses statistical modeling of VQ-code transitions. A bigram-constrained HMM is obtained by combining a VQ-code bigram with the conventional speaker-independent HMM. The proposed model reduces the overlap of feature distributions between different phonemes by restricting local VQ-code transitions. The output probabilities in the model are conditioned on the VQ code of the previous frame, so the output probability distribution changes with the previous frame even within the same state. A speaker-dependent bigram-constrained HMM is obtained using a VQ-code bigram calculated from utterances of an input speaker. A speaker-independent bigram-constrained HMM is obtained using a VQ-code bigram calculated from utterances of many speakers. The model was evaluated in an 18-Japanese-consonant recognition experiment using 5240 words. The speaker-independent bigram-constrained HMM achieved an average recognition accuracy of 76.3%, which is 5.5% higher than the conventional speaker-independent HMM.
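
The abstract does not state the exact rule for combining the VQ-code bigram with the HMM output probabilities. As a rough illustration only, the sketch below assumes a multiply-and-renormalize rule, in which the conditioned output probability of code k in a state, given previous code j, is proportional to b(k) * P(k | j). The function names `estimate_bigram` and `bigram_constrained_output`, the add-one smoothing, and the fallback for an all-zero product are hypothetical choices made for this example, not details taken from the paper.

```python
from typing import List, Sequence


def estimate_bigram(
    sequences: Sequence[Sequence[int]],
    num_codes: int,
    smoothing: float = 1.0,  # assumed additive smoothing; not specified in the abstract
) -> List[List[float]]:
    """Estimate P(current code | previous code) from VQ-code sequences.

    Returns a num_codes x num_codes table where row j holds P(k | j).
    """
    counts = [[smoothing] * num_codes for _ in range(num_codes)]
    for seq in sequences:
        # Count each adjacent (previous frame, current frame) code pair.
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1.0

    bigram = []
    for row in counts:
        total = sum(row)
        if total == 0.0:
            # Unseen previous code and no smoothing: fall back to uniform.
            bigram.append([1.0 / num_codes] * num_codes)
        else:
            bigram.append([c / total for c in row])
    return bigram


def bigram_constrained_output(
    b_state: Sequence[float],           # conventional output probabilities b(k) of one state
    bigram: Sequence[Sequence[float]],  # bigram[j][k] = P(k | j)
    prev_code: int,                     # VQ code observed in the previous frame
) -> List[float]:
    """Output distribution of one state, conditioned on the previous VQ code.

    Assumed combination rule: b(k | j) is proportional to b(k) * P(k | j).
    """
    weighted = [b * p for b, p in zip(b_state, bigram[prev_code])]
    total = sum(weighted)
    if total == 0.0:
        # No code is allowed by both models: keep the unconstrained distribution.
        return list(b_state)
    return [w / total for w in weighted]
```

Under this assumption, a speaker-dependent model would pass only the input speaker's code sequences to `estimate_bigram`, while a speaker-independent model would pool sequences from many speakers; the state output probabilities `b_state` stay the same in both cases.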