The LIA speech recognition system: from 10xRT to 1xRT

  • Authors:
  • G. Linarès; P. Nocera; D. Massonié; D. Matrouf

  • Affiliations:
  • Laboratoire Informatique d'Avignon, LIA, Avignon, France (all authors)

  • Venue:
  • TSD'07 Proceedings of the 10th international conference on Text, speech and dialogue
  • Year:
  • 2007

Abstract

The LIA developed a speech recognition toolkit providing most of the components required by speech-to-text systems. This toolbox was used to build a Broadcast News (BN) transcription system that took part in the ESTER evaluation campaign ([1]), on the unconstrained transcription and real-time transcription tasks. In this paper, we describe the techniques we used to reach real time, starting from our baseline 10xRT system. We focus on some aspects of the A* search algorithm which are critical for both efficiency and accuracy. Then, we evaluate the impact of the different system components (lexicon, language models and acoustic models) on the trade-off between efficiency and accuracy. Experiments are carried out in the framework of the ESTER evaluation campaign. Our results show that the real-time system is about 5.6% absolute WER (Word Error Rate) worse than the standard 10xRT system, reaching an absolute WER of about 26.8%.
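
The two quantities the abstract trades against each other can be made concrete with a minimal sketch. It assumes the conventional definitions: WER as the word-level edit distance divided by the number of reference words, and the real-time factor (the "xRT" figure) as processing time divided by audio duration. The function names and the example sentences are hypothetical and are not taken from the LIA toolkit or the ESTER scoring tools.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time per second of audio: 10.0 means 10xRT, 1.0 means 1xRT."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # Hypothetical example: one substitution and one deletion over six words.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.1%}")
    # Hypothetical timing: 600 s of computation for 60 s of audio.
    print(f"Real-time factor: {real_time_factor(600, 60):.0f}xRT")
```

Under these definitions, the figures quoted in the abstract imply that the 10xRT baseline sits at roughly 26.8 - 5.6 = 21.2% WER. This is an inference from the two reported numbers, not a value stated in the abstract.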