Comparing Stochastic Approaches to Spoken Language Understanding in Multiple Languages

Authors:
S. Hahn;M. Dinarelli;C. Raymond;F. Lefevre;P. Lehnen;R. De Mori;A. Moschitti;H. Ney;G. Riccardi
Affiliations:
Comput. Sci. Dept., RWTH Aachen Univ., Aachen, Germany;-;-;-;-;-;-;-;-
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2011

Abstract

One of the first steps in building a spoken language understanding (SLU) module for dialogue systems is the extraction of flat concepts from a given word sequence, usually provided by an automatic speech recognition (ASR) system. In this paper, six different modeling approaches are investigated for the task of concept tagging. These include classical, well-known generative and discriminative methods such as Finite State Transducers (FSTs), Statistical Machine Translation (SMT), Maximum Entropy Markov Models (MEMMs), and Support Vector Machines (SVMs), as well as techniques more recently applied to natural language processing such as Conditional Random Fields (CRFs) and Dynamic Bayesian Networks (DBNs). Following a detailed description of the models, experimental and comparative results are presented on three corpora in different languages and of differing complexity. The French MEDIA corpus has already been used in an evaluation campaign, so a direct comparison with existing benchmarks is possible. Recently collected Italian and Polish corpora are used to test the robustness and portability of the modeling approaches. For all tasks, manual transcriptions as well as ASR inputs are considered. In addition to single systems, methods for system combination are investigated. The best performing model on all tasks is based on conditional random fields. On the MEDIA evaluation corpus, a concept error rate (CER) of 12.6% is achieved. Here, in addition to attribute names, attribute values are extracted using a combination of a rule-based and a statistical approach. Applying system combination using weighted ROVER with all six systems, the CER drops to 12.0%.
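The abstract reports results as concept error rate without defining it. As a minimal sketch, assuming the common definition by analogy with word error rate (substitutions, deletions and insertions between the reference and hypothesis concept sequences, divided by the number of reference concepts), the metric could be computed as below. The function name and the concept labels are hypothetical illustrations and are not taken from the paper or from the MEDIA annotation scheme.

```python
def concept_error_rate(reference, hypothesis):
    """Edit distance over concept labels divided by the reference length.

    Assumed definition (not stated in the abstract):
    (substitutions + deletions + insertions) / number of reference concepts.
    """
    if not reference:
        raise ValueError("reference must contain at least one concept")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis concepts.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_concept in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_concept in enumerate(hypothesis, start=1):
            cost = 0 if ref_concept == hyp_concept else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference concept missed
                curr[j - 1] + 1,     # insertion: spurious hypothesis concept
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical concept sequences for one utterance.
reference = ["command", "room-type", "date", "city"]
hypothesis = ["command", "date", "hotel-name", "city"]
print(concept_error_rate(reference, hypothesis))  # 0.5
```

In this example two edits are needed to turn the hypothesis into the four-concept reference, giving a rate of 0.5. Because insertions are counted, the rate can exceed 1.0 when the hypothesis is much longer than the reference.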