We report on recent improvements to an HMM-based continuous speech recognition system under development at AT&T Bell Laboratories. These advances, which include the incorporation of inter-word, context-dependent units and an improved feature analysis, lead to a recognition system that achieves better than 95% word accuracy for speaker-independent recognition on the 1000-word DARPA resource management task using the standard word-pair grammar (with a perplexity of about 60). We show that incorporating inter-word units into training yields better acoustic models of word-juncture coarticulation and gives a 20% reduction in error rate. An improved set of spectral and log-energy features further reduces word error rate by about 30%. We also found that spectral vectors corresponding to the same speech unit behave differently statistically, depending on whether they occur at word boundaries or within a word. This result suggests that intra-word and inter-word units should be modeled independently, even when they appear in the same context. Using a set of sub-word units that includes distinct intra-word and inter-word variants of the context-dependent phones gives an additional decrease of about 10% in word error rate.
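The three error-rate reductions quoted above (about 20%, 30%, and 10%) can be combined into a rough overall figure. The sketch below is only illustrative: it assumes the reductions are relative and apply one after another, each to the error rate left by the previous step, which the abstract suggests ("further", "additional") but does not state outright. The function name `compound_relative_reductions` is hypothetical and does not come from the paper.

```python
def compound_relative_reductions(reductions):
    """Return the fraction of the original error rate that remains after
    applying each relative reduction in sequence.

    Assumption (not stated in the abstract): each reduction is relative to
    the error rate left by the previous step, so the factors multiply.
    """
    remaining = 1.0
    for r in reductions:
        if not 0.0 <= r < 1.0:
            raise ValueError(f"relative reduction must be in [0, 1): {r}")
        remaining *= 1.0 - r
    return remaining


# Approximate figures quoted in the abstract: inter-word units (20%),
# improved features (30%), separate intra-/inter-word variants (10%).
remaining = compound_relative_reductions([0.20, 0.30, 0.10])
print(f"fraction of baseline errors remaining: {remaining:.3f}")
print(f"combined relative reduction: {1.0 - remaining:.1%}")
```

Under these assumptions the three changes together remove roughly half of the baseline word errors. Because the quoted percentages are themselves approximate, the combined figure should be read as an estimate rather than a reported result.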