From prepared speech to spontaneous speech recognition system: a comparative study applied to French language

Authors:
Richard Dufour
Affiliations:
Université du Maine, Le Mans, France
Venue:
CSTST '08 Proceedings of the 5th international conference on Soft computing as transdisciplinary science and technology
Year:
2008

Abstract

Automatic speech recognition (ASR) systems have more trouble processing spontaneous speech (e.g. debates) than prepared speech (e.g. broadcast news). These difficulties are due to peculiarities of spontaneous speech (false starts, repetitions, schwa, etc.). In this paper, we highlight some of these peculiarities, especially in French. We show that manual transcriptions that have no link with the target application, but that contain only very spontaneous speech, make it possible to estimate a better language model, strongly decreasing perplexity and significantly decreasing the word error rate on spontaneous speech. Other knowledge bases used by the ASR system also have to be adapted. For example, our work shows that adding specific pronunciation variants seems useful, but that these variants have to be constrained and modeled. Finally, we compare the errors of our CMU Sphinx-based ASR system on spontaneous vs. prepared speech.
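
The abstract reports its results as word error rate (WER). As a reading aid, here is a minimal sketch of the standard WER definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It is not the scoring tool used in the paper. The function name `word_error_rate` and the example sentences are hypothetical, and the sketch assumes simple whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation handling).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: a repetition ("je je") in the reference
# that the recognizer output drops counts as one deletion.
wer = word_error_rate("je je ne sais pas", "je ne sais pas")
print(f"WER = {wer:.2f}")
```

Because the denominator is the reference length, a hypothesis with many insertions can give a WER above 1.0.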