Large vocabulary speech recognition in French

Authors:
M. Adda-Decker;G. Adda;J. Gauvain;L. Lamel
Affiliations:
Lab. d'Informatique pour la Mecanique et les Sci. de l'Ingenieur, CNRS, Orsay, France;-;-;-
Venue:
ICASSP '99 Proceedings of the Acoustics, Speech, and Signal Processing, 1999. on 1999 IEEE International Conference - Volume 01
Year:
1999

Citing 0
Cited 1

First approach in the development of multimedia information retrieval resources for the Basque context

TSD'07 Proceedings of the 10th international conference on Text, speech and dialogue

Quantified Score

Hi-index	0.00

Visualization

Abstract

We present some design considerations concerning our large vocabulary continuous speech recognition system in French. The impact of the epoch of the text training material on lexical coverage, language model perplexity and recognition performance on newspaper texts is demonstrated. The effectiveness of larger vocabulary sizes and larger text training corpora for language modeling is investigated. French is a highly inflected language producing large lexical variety and a high homophone rate. About 30% of recognition errors are shown to be due to substitutions between inflected forms of a given root form. When word error rates are analysed as a function of word frequency, a significant increase in the error rate can be measured for frequency ranks above 5000.