An investigation of single-pass ASR system combination for spoken language understanding

  • Authors:
  • Fethi Bougares;Mickael Rouvier;Nathalie Camelin;Paul Deléglise;Yannick Estève

  • Affiliations:
  • LIUM - University of Le Mans, France;LIUM - University of Le Mans, France;LIUM - University of Le Mans, France;LIUM - University of Le Mans, France;LIUM - University of Le Mans, France

  • Venue:
  • SLSP'13 Proceedings of the First international conference on Statistical Language and Speech Processing
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper studies the benefits provided by a single-pass Automatic Speech Recognition (ASR) exchange-based combination approach for spoken dialog system. Three famous open-source ASR systems are used to experiment this approach in the framework of Spoken Language Understanding (SLU). On the ASR side, single-pass ASR systems are used with an online acoustic model adaptation using the previous utterances said by a speaker. On the SLU side, a competitive CRF-based SLU system is applied on outputs of ASR system to obtain the semantic concepts. The evaluation is done on the French PORT-MEDIA test data in terms of both Word Error Rate (WER) and Concept Error Rate (CER). While the best single pass system used alone shows a CER of 29.8% for a WER of 22.8%, single-pass ASR exchange-based combination reaches a CER of 27.3% for a WER of 26%. This CER is only slightly higher than the one reached by a 5-passes ASR system which obtained a CER of 26.8% for a WER of 22.8% in better conditions, i.e. better acoustic model adaptation made on all the speech utterances said by a speaker, advanced feature extraction techniques and search graph rescoring using language model with higher order.