From prepared speech to spontaneous speech recognition system: a comparative study applied to French language

  • Authors:
  • Richard Dufour

  • Affiliations:
  • Université du Maine, Le Mans, France

  • Venue:
  • CSTST '08 Proceedings of the 5th international conference on Soft computing as transdisciplinary science and technology
  • Year:
  • 2008

Abstract

Automatic speech recognition (ASR) systems have more trouble processing spontaneous speech (e.g. debates) than prepared speech (e.g. broadcast news). These difficulties stem from peculiarities of spontaneous speech (false starts, repetitions, schwa, etc.). In this paper, we highlight some of these peculiarities, especially in French. We show that manual transcriptions unrelated to the target application, but containing only very spontaneous speech, make it possible to estimate a better language model, strongly decreasing perplexity and significantly decreasing the word error rate on spontaneous speech. However, the other knowledge bases used by the ASR system also have to be adapted. For example, our work shows that adding specific pronunciation variants seems useful, but has to be constrained and modeled. Finally, we compare the errors of our CMU Sphinx-based ASR system on spontaneous vs. prepared speech.
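
For readers unfamiliar with the word error rate mentioned in the abstract, the following is a minimal sketch of how it is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcription and the recognizer output, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions, not taken from the paper, and the paper's own scoring tool and text normalization are not described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenization is a plain
    whitespace split, which real scoring tools usually refine.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: the recognizer drops "ne", a word often omitted in
# spontaneous French, giving one deletion over four reference words.
print(word_error_rate("je ne sais pas", "je sais pas"))
```

Because insertions count as errors, the value can exceed 1.0 when the output is much longer than the reference.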