Improved acoustic modeling for continuous speech recognition

Authors:
C.-H. Lee; E. Giachin; L. R. Rabiner; R. Pieraccini; A. E. Rosenberg
Venue:
HLT '90 Proceedings of the workshop on Speech and Natural Language
Year:
1990

Abstract

We report on recent improvements to an HMM-based continuous speech recognition system under development at AT&T Bell Laboratories. These advances, which include the incorporation of inter-word, context-dependent units and an improved feature analysis, lead to a recognition system that achieves better than 95% word accuracy for speaker-independent recognition on the 1000-word DARPA Resource Management task using the standard word-pair grammar (with a perplexity of about 60). We show that incorporating inter-word units into training results in better acoustic models of word-juncture coarticulation and gives a 20% reduction in error rate. An improved set of spectral and log-energy features further reduces word error rate by about 30%. We also found that spectral vectors corresponding to the same speech unit behave differently statistically, depending on whether they occur at word boundaries or within a word. The results suggest that intra-word and inter-word units should be modeled independently, even when they appear in the same context. Using a set of sub-word units that includes separate variants for intra-word and inter-word context-dependent phones yields an additional decrease of about 10% in word error rate.
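
The three error-rate reductions quoted in the abstract (about 20%, 30%, and 10%) can be combined arithmetically if one assumes that they are relative reductions applied one after another, each measured against the preceding configuration. The abstract's wording ("further", "additional") suggests this reading but does not state it outright, so the sketch below is only an illustration under that assumption. The function name is hypothetical and not taken from the paper.

```python
def compound_relative_reductions(reductions):
    """Return the fraction of the baseline error that remains after applying
    each relative reduction in turn.

    Assumption (not confirmed by the abstract): every reduction is relative
    to the error rate left by the previous step, so the effects multiply.
    """
    remaining = 1.0
    for r in reductions:
        if not 0.0 <= r < 1.0:
            raise ValueError("each reduction must lie in [0, 1)")
        remaining *= 1.0 - r
    return remaining


# Approximate figures quoted in the abstract: inter-word units (20%),
# improved features (30%), separate intra-/inter-word variants (10%).
remaining = compound_relative_reductions([0.20, 0.30, 0.10])
print(f"remaining fraction of baseline error: {remaining:.3f}")
print(f"overall relative reduction: {1.0 - remaining:.1%}")
```

Under this reading the final system would retain roughly half of the baseline word error. This is a derived figure that depends on the assumption above, not a result reported by the authors.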