Japanese speech understanding using grammar specialization

  • Authors:
  • Manny Rayner;Nikos Chatzichrisafis;Pierrette Bouillon;Yukie Nakao;Hitoshi Isahara;Kyoko Kanzaki;Beth Ann Hockey;Marianne Santaholma;Marianne Starlander

  • Affiliations:
  • University of Geneva, TIM/ISSCO, Geneva, Switzerland;University of Geneva, TIM/ISSCO, Geneva, Switzerland;University of Geneva, TIM/ISSCO, Geneva, Switzerland;National Institute of Information and Communications Technology, Seika-cho, Soraku-gun, Kyoto, Japan;National Institute of Information and Communications Technology, Seika-cho, Soraku-gun, Kyoto, Japan;National Institute of Information and Communications Technology, Seika-cho, Soraku-gun, Kyoto, Japan;UCSC/NASA Ames Research Center, Moffett Field, CA;University of Geneva, TIM/ISSCO, Geneva, Switzerland;University of Geneva, TIM/ISSCO, Geneva, Switzerland

  • Venue:
  • HLT-Demo '05 Proceedings of HLT/EMNLP on Interactive Demonstrations
  • Year:
  • 2005

Abstract

The most common speech understanding architecture for spoken dialogue systems combines speech recognition based on a class N-gram language model with robust parsing. For many types of applications, however, grammar-based recognition can offer concrete advantages. Training a good class N-gram language model requires substantial quantities of corpus data, which are generally not available at the start of a new project. Head-to-head comparisons of class N-gram/robust and grammar-based systems also suggest that users who are familiar with system coverage get better results from grammar-based architectures (Knight et al., 2001). As a consequence, deployed spoken dialogue systems for real-world applications frequently use grammar-based methods. This is particularly the case for speech translation systems. Although leading research systems like Verbmobil and NESPOLE! (Wahlster, 2000; Lavie et al., 2001) usually employ complex architectures combining statistical and rule-based methods, successful practical examples like Phraselator and S-MINDS (Phraselator, 2005; Sehda, 2005) are typically phrasal translators with grammar-based recognizers.