Japanese speech understanding using grammar specialization

Authors:
Manny Rayner;Nikos Chatzichrisafis;Pierrette Bouillon;Yukie Nakao;Hitoshi Isahara;Kyoko Kanzaki;Beth Ann Hockey;Marianne Santaholma;Marianne Starlander
Affiliations:
University of Geneva, TIM/ISSCO, Geneva, Switzerland;University of Geneva, TIM/ISSCO, Geneva, Switzerland;University of Geneva, TIM/ISSCO, Geneva, Switzerland;National Institute of Information and Communications Technology, Seika-cho, Soraku-gun, Kyoto, Japan;National Institute of Information and Communications Technology, Seika-cho, Soraku-gun, Kyoto, Japan;National Institute of Information and Communications Technology, Seika-cho, Soraku-gun, Kyoto, Japan;UCSC/NASA Ames Research Center, Moffett Field, CA;University of Geneva, TIM/ISSCO, Geneva, Switzerland;University of Geneva, TIM/ISSCO, Geneva, Switzerland
Venue:
HLT-Demo '05 Proceedings of HLT/EMNLP on Interactive Demonstrations
Year:
2005

Citing 2
Cited 1

An open source environment for compiling typed unification grammars into speech recognisers
EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 2

Architecture and design considerations in NESPOLE!: a speech translation system for e-commerce applications
HLT '01 Proceedings of the first international conference on Human language technology research

Evaluating task performance for a unidirectional controlled language medical speech translation system
MST '06 Proceedings of the Workshop on Medical Speech Translation

Abstract

The most common speech understanding architecture for spoken dialogue systems is a combination of speech recognition based on a class N-gram language model, and robust parsing. For many types of applications, however, grammar-based recognition can offer concrete advantages. Training a good class N-gram language model requires substantial quantities of corpus data, which is generally not available at the start of a new project. Head-to-head comparisons of class N-gram/robust and grammar-based systems also suggest that users who are familiar with system coverage get better results from grammar-based architectures (Knight et al., 2001). As a consequence, deployed spoken dialogue systems for real-world applications frequently use grammar-based methods. This is particularly the case for speech translation systems. Although leading research systems like Verbmobil and NESPOLE! (Wahlster, 2000; Lavie et al., 2001) usually employ complex architectures combining statistical and rule-based methods, successful practical examples like Phraselator and S-MINDS (Phraselator, 2005; Sehda, 2005) are typically phrasal translators with grammar-based recognizers.