A major axis of research at LIMSI is multilingual, speaker-independent, large-vocabulary speech dictation. This paper describes the LIMSI recognizer evaluated in the ARPA NOV93 CSR test and reports experimental results on the WSJ and BREF corpora under closely matched conditions. For both corpora, word recognition experiments were carried out with vocabularies containing up to 20k words. The recognizer uses continuous-density HMMs with Gaussian mixtures for acoustic modeling and n-gram statistics estimated on newspaper texts for language modeling. Decoding follows a time-synchronous graph-search strategy, which is shown to still be viable with a 20k-word vocabulary when used with a bigram back-off language model. A second forward pass, which makes use of a word graph generated with the bigram, incorporates a trigram language model. Acoustic modeling uses cepstrum-based features, context-dependent phone models (intra- and inter-word), phone duration models, and sex-dependent models.
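To make the back-off idea concrete, here is a minimal sketch of a bigram back-off language model. It uses simple absolute discounting to reserve probability mass for unseen bigrams, which is then redistributed via the unigram distribution; this is an illustrative assumption, not the exact smoothing scheme (e.g. Katz/Good-Turing back-off) used in the LIMSI system.

```python
from collections import Counter

def train_backoff_bigram(tokens, discount=0.5):
    """Return a function p(w1, w2) estimating P(w2 | w1) with back-off.

    Illustrative sketch: absolute discounting reserves mass from seen
    bigrams; unseen bigrams back off to a renormalized unigram model.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    total = sum(unigrams.values())

    def p_unigram(w):
        return unigrams[w] / total

    def p_bigram(w1, w2):
        c1 = unigrams[w1]
        c12 = bigrams[(w1, w2)]
        if c12 > 0:
            # Discounted maximum-likelihood estimate for seen bigrams.
            return (c12 - discount) / c1
        # Back off: spread the reserved mass over unseen successors
        # in proportion to their unigram probabilities.
        seen = [w for (a, w) in bigrams if a == w1]
        reserved = discount * len(seen) / c1
        norm = 1.0 - sum(p_unigram(w) for w in seen)
        return reserved * p_unigram(w2) / norm

    return p_bigram
```

In a decoder like the one described above, these bigram scores would be combined with acoustic likelihoods during the time-synchronous search, while the trigram model would rescore paths in the word graph on the second pass.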