This paper describes several key experiments in large-vocabulary speech recognition. We demonstrate that, counter to our intuitions, given a fixed amount of training speech, the number of training speakers has little effect on recognition accuracy. We show how much speech is needed for speaker-independent (SI) recognition to achieve the same performance as speaker-dependent (SD) recognition. We demonstrate that, although the N-Best Paradigm works quite well for vocabularies of up to 5,000 words, it begins to break down with 20,000 words and long sentences. We compare the performance of two feature preprocessing algorithms for microphone independence, and we describe a new microphone adaptation algorithm based on selection among several codebook transformations.