Spoken language understanding using weakly supervised learning

Authors:
Wei-Lin Wu;Ru-Zhan Lu;Jian-Yong Duan;Hui Liu;Feng Gao;Yu-Quan Chen
Affiliations:
Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200030, PR China (all authors)
Venue:
Computer Speech and Language
Year:
2010

Abstract

In this paper, we present a weakly supervised learning approach for spoken language understanding (SLU) in domain-specific dialogue systems. We model SLU as a two-stage classification problem. First, a topic classifier identifies the topic of an input utterance. Second, restricted to the recognized topic, the slot classifiers are trained to extract the corresponding slot-value pairs. The approach is mainly data-driven and requires only a minimally annotated corpus for training, while retaining robustness and depth of understanding for spoken language. More importantly, it allows weakly supervised strategies to be employed for training the two kinds of classifiers, which could significantly reduce the number of labeled sentences required. We investigate active learning and naive self-training for the two kinds of classifiers. We also propose a practical method for bootstrapping topic-dependent slot classifiers from a small number of labeled sentences. Experiments were conducted in the Chinese public transportation information inquiry domain and the English DARPA Communicator domain. The results show the effectiveness of the proposed SLU framework and demonstrate that human labeling effort can be reduced significantly.
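
The two-stage structure described in the abstract, together with naive self-training, can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the paper: the tiny naive Bayes classifier, the toy utterances, the topic and slot names (route_query, time_query, origin, destination, station), the fixed PLACES lexicon used to find candidate slot values, the use of the preceding word as the only slot feature, and the 0.9 confidence threshold. The paper's actual classifiers, features, and bootstrapping procedure may differ.

```python
import math
from collections import Counter, defaultdict


class BagOfWordsClassifier:
    """Tiny multinomial naive Bayes with add-one smoothing (illustrative stand-in)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter()
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.label_counts[label] += 1
            self.vocab.update(words)
        return self

    def predict_proba(self, text):
        words = text.lower().split()
        total = sum(self.label_counts.values())
        log_scores = {}
        for label, n in self.label_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)
            for w in words:
                score += math.log((self.word_counts[label][w] + 1) / denom)
            log_scores[label] = score
        # Normalize in a numerically stable way.
        top = max(log_scores.values())
        exp = {label: math.exp(s - top) for label, s in log_scores.items()}
        z = sum(exp.values())
        return {label: v / z for label, v in exp.items()}

    def predict(self, text):
        probs = self.predict_proba(text)
        return max(probs, key=probs.get)


# Hypothetical toy data: utterances labeled with a topic.
TOPIC_TEXTS = [
    "how do i get from xujiahui to pudong",
    "which bus goes from hongqiao to xujiahui",
    "when is the first bus at hongqiao",
    "when does the last bus leave pudong",
]
TOPIC_LABELS = ["route_query", "route_query", "time_query", "time_query"]
PLACES = {"xujiahui", "pudong", "hongqiao"}  # assumed lexicon of candidate values

# Stage 1: one topic classifier over whole utterances.
topic_clf = BagOfWordsClassifier().fit(TOPIC_TEXTS, TOPIC_LABELS)

# Stage 2: one slot classifier per topic; the feature is the word before the value.
slot_clfs = {
    "route_query": BagOfWordsClassifier().fit(["from", "to"], ["origin", "destination"]),
    "time_query": BagOfWordsClassifier().fit(["at", "leave"], ["station", "station"]),
}


def understand(utterance):
    """Classify the topic, then label each candidate value with that topic's slot classifier."""
    topic = topic_clf.predict(utterance)
    slot_clf = slot_clfs[topic]
    tokens = utterance.lower().split()
    slots = {}
    for i, tok in enumerate(tokens):
        if tok in PLACES:
            context = tokens[i - 1] if i > 0 else "<s>"
            slots[slot_clf.predict(context)] = tok
    return topic, slots


def self_train(texts, labels, unlabeled, threshold=0.9):
    """One round of naive self-training: keep confident predictions, then retrain."""
    texts, labels = list(texts), list(labels)
    clf = BagOfWordsClassifier().fit(texts, labels)
    for utt in unlabeled:
        probs = clf.predict_proba(utt)
        best = max(probs, key=probs.get)
        if probs[best] >= threshold:
            texts.append(utt)
            labels.append(best)
    return BagOfWordsClassifier().fit(texts, labels), len(texts)


retrained_clf, n_labeled = self_train(
    TOPIC_TEXTS, TOPIC_LABELS, ["how do i get from pudong to hongqiao", "bus"]
)
print(understand("how do i get from pudong to hongqiao"))
print(n_labeled)
```

In this sketch the confident route utterance is added to the labeled pool, while the ambiguous one-word utterance "bus" falls below the threshold and is left unlabeled. An active learning variant would instead send such low-confidence utterances to a human annotator.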