Improving identification accuracy by extending acceptable utterances in spoken dialogue system using barge-in timing

  • Authors:
  • Kyoko Matsuyama;Kazunori Komatani;Toru Takahashi;Tetsuya Ogata;Hiroshi G. Okuno

  • Affiliations:
  • Graduate School of Informatics, Kyoto University, Kyoto, Japan;Graduate School of Informatics, Kyoto University, Kyoto, Japan;Graduate School of Informatics, Kyoto University, Kyoto, Japan;Graduate School of Informatics, Kyoto University, Kyoto, Japan;Graduate School of Informatics, Kyoto University, Kyoto, Japan

  • Venue:
  • IEA/AIE'10 Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part II
  • Year:
  • 2010

Abstract

We describe a novel dialogue strategy that enables robust interaction in noisy environments, where automatic speech recognition (ASR) results are not necessarily reliable. We have developed a method that exploits utterance timing together with ASR results to interpret user intention, that is, to identify the one item a user wants to indicate from those the system enumerates. The timing of utterances containing referential expressions is approximated by a gamma distribution, which is integrated with the ASR results by expressing both as probabilities. In this paper, we improve the identification accuracy by extending the method. First, we enable interpretation of utterances that include ordinal numbers, which appear several times in the data we collected from users. Then we use proper acoustic models and parameters, improving the identification accuracy by 4.0% in total. We also show that Latent Semantic Mapping (LSM) enables more expressions to be handled in our framework.
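
The integration of barge-in timing with ASR results can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the gamma shape and scale values, and the plain product-and-normalize combination rule are all assumptions chosen for illustration. The abstract states only that timing is modeled by a gamma distribution and that both sources are expressed as probabilities.

```python
import math


def gamma_pdf(x, shape, scale):
    """Density of a gamma distribution; zero for non-positive x."""
    if x <= 0:
        return 0.0
    norm = math.gamma(shape) * scale ** shape
    return x ** (shape - 1) * math.exp(-x / scale) / norm


def identify_item(item_start_times, barge_in_time, asr_probs,
                  shape=2.0, scale=0.5):
    """Pick the enumerated item the user most likely referred to.

    item_start_times: time (s) at which the system began reading each item.
    barge_in_time:    time (s) at which the user's utterance started.
    asr_probs:        per-item probabilities derived from the ASR result.
    shape, scale:     hypothetical gamma parameters, not values from the paper.
    """
    # Timing likelihood: delay between an item's start and the barge-in.
    timing = [gamma_pdf(barge_in_time - t, shape, scale)
              for t in item_start_times]

    # Assumed combination rule: multiply the two sources, then normalize.
    joint = [a * t for a, t in zip(asr_probs, timing)]
    total = sum(joint)
    if total == 0:
        # No timing evidence for any item: fall back to ASR alone.
        total = sum(asr_probs)
        joint = list(asr_probs)
    posterior = [j / total for j in joint]

    best = max(range(len(posterior)), key=posterior.__getitem__)
    return best, posterior


if __name__ == "__main__":
    # Three items read out at 0 s, 2 s and 4 s; the user barges in at 2.8 s.
    # The ASR result alone weakly favors the first item.
    best, posterior = identify_item([0.0, 2.0, 4.0], 2.8, [0.5, 0.3, 0.2])
    print(best, posterior)
```

In this example the barge-in falls shortly after the second item begins, so the timing term outweighs the ASR preference for the first item. An item that has not yet been read out receives zero timing likelihood and cannot be selected unless no item has timing support.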