A major axis of research at LIMSI is multilingual, speaker-independent, large-vocabulary speech dictation. This paper describes the LIMSI recognizer evaluated in the ARPA NOV93 CSR test and reports experimental results on the WSJ and BREF corpora under closely matched conditions. For both corpora, word recognition experiments were carried out with vocabularies containing up to 20k words. The recognizer uses continuous-density HMMs with Gaussian mixtures for acoustic modeling and n-gram statistics estimated on newspaper texts for language modeling. Decoding follows a time-synchronous graph-search strategy, which is shown to still be viable with a 20k-word vocabulary when used with a bigram back-off language model. A second forward pass, which makes use of a word graph generated with the bigram, incorporates a trigram language model. Acoustic modeling uses cepstrum-based features, context-dependent phone models (intra- and inter-word), phone duration models, and sex-dependent models.
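To make the back-off idea concrete, here is a minimal sketch of a bigram back-off language model. It uses simple absolute discounting to reserve probability mass for unseen bigrams, which is then redistributed via the unigram distribution; this is an illustrative assumption, not the exact smoothing scheme (e.g. Katz/Good-Turing back-off) used in the LIMSI system.

```python
from collections import Counter

def train_backoff_bigram(tokens, discount=0.5):
    """Return a function p(w1, w2) estimating P(w2 | w1) with back-off.

    Illustrative sketch: absolute discounting reserves mass from seen
    bigrams; unseen bigrams back off to a renormalized unigram model.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    total = sum(unigrams.values())

    def p_unigram(w):
        return unigrams[w] / total

    def p_bigram(w1, w2):
        c1 = unigrams[w1]
        c12 = bigrams[(w1, w2)]
        if c12 > 0:
            # Discounted maximum-likelihood estimate for seen bigrams.
            return (c12 - discount) / c1
        # Back off: spread the reserved mass over unseen successors
        # in proportion to their unigram probabilities.
        seen = [w for (a, w) in bigrams if a == w1]
        reserved = discount * len(seen) / c1
        norm = 1.0 - sum(p_unigram(w) for w in seen)
        return reserved * p_unigram(w2) / norm

    return p_bigram
```

In a decoder like the one described above, these bigram scores would be combined with acoustic likelihoods during the time-synchronous search, while the trigram model would rescore paths in the word graph on the second pass.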