Improved acoustic modeling can significantly decrease the error rate in large-vocabulary speech recognition. Our approach to the problem is twofold. First, we propose a scheme that optimizes the degree of mixture tying for a given amount of training data and computational resources. Experimental results on the Wall Street Journal (WSJ) Corpus show that this new form of output distribution achieves a 25% reduction in error rate over typical tied-mixture systems. Second, we show that a further improvement can be achieved by modeling local time correlation with linear discriminant features.
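To make the idea of a "degree of mixture tying" concrete, the following is a minimal sketch and not the system described in the abstract. It assumes one-dimensional Gaussians and a hand-written assignment of HMM states to shared Gaussian codebooks; the names `gaussian_pdf`, `output_prob`, `codebooks`, `state_to_codebook`, and `weights`, and all numeric values, are hypothetical. With a single codebook shared by every state this reduces to a conventional tied-mixture system, and with one codebook per state it becomes a fully continuous system; the intermediate case is shown here.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def output_prob(x, state, state_to_codebook, codebooks, weights):
    """b_s(x) = sum_k w[s][k] * N(x; mu_k, var_k).

    The Gaussians come from the codebook shared by the state's cluster;
    the mixture weights are specific to the state.
    """
    codebook = codebooks[state_to_codebook[state]]
    return sum(w * gaussian_pdf(x, m, v) for w, (m, v) in zip(weights[state], codebook))


# Hypothetical toy setup: three HMM states sharing two codebooks.
# Each codebook entry is a (mean, variance) pair.
codebooks = {
    "A": [(0.0, 1.0), (2.0, 1.0)],
    "B": [(-1.0, 0.5), (3.0, 2.0)],
}
state_to_codebook = {"s1": "A", "s2": "A", "s3": "B"}  # s1 and s2 are tied
weights = {"s1": [0.7, 0.3], "s2": [0.2, 0.8], "s3": [0.5, 0.5]}

for s in ("s1", "s2", "s3"):
    print(s, output_prob(0.0, s, state_to_codebook, codebooks, weights))
```

States `s1` and `s2` evaluate the same two Gaussians and differ only in their weights, so the Gaussian evaluations can be shared between them. Changing the number of codebooks trades that sharing against modeling detail, which is the quantity the abstract proposes to optimize for the available training data and computation.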