Spoken language understanding using weakly supervised learning

  • Authors:
  • Wei-Lin Wu; Ru-Zhan Lu; Jian-Yong Duan; Hui Liu; Feng Gao; Yu-Quan Chen

  • Affiliations:
  • Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200030, PR China (all authors)

  • Venue:
  • Computer Speech and Language
  • Year:
  • 2010

Abstract

In this paper, we present a weakly supervised learning approach to spoken language understanding (SLU) in domain-specific dialogue systems. We model SLU as a two-stage classification problem. First, a topic classifier identifies the topic of an input utterance. Second, restricted to the recognized topic, topic-dependent slot classifiers extract the corresponding slot-value pairs. The approach is mainly data-driven and requires only a minimally annotated corpus for training, while retaining robustness and depth of understanding for spoken language. More importantly, it allows weakly supervised strategies to be used in training both kinds of classifiers, which could significantly reduce the number of labeled sentences required. We investigate active learning and naive self-training for both kinds of classifiers. We also propose a practical method for bootstrapping topic-dependent slot classifiers from a small number of labeled sentences. Experiments were conducted on the Chinese public transportation information inquiry domain and the English DARPA Communicator domain. The results show the effectiveness of the proposed SLU framework and demonstrate that human labeling effort can be reduced significantly.
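
The two-stage idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy naive Bayes classifier, the previous/next-word context features, the hypothetical `LEXICON` of slot-value candidates, the topic and slot names, the training utterances, and the confidence threshold used for one pass of naive self-training on the topic classifier.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Toy multinomial naive Bayes with add-one smoothing (illustrative stand-in)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter()
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.label_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict_proba(self, text):
        tokens = text.lower().split()
        total = sum(self.label_counts.values())
        log_scores = {}
        for label, n in self.label_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            log_scores[label] = score
        top = max(log_scores.values())  # subtract the max for numerical stability
        exp = {label: math.exp(s - top) for label, s in log_scores.items()}
        z = sum(exp.values())
        return {label: v / z for label, v in exp.items()}

    def predict(self, text):
        probs = self.predict_proba(text)
        return max(probs, key=probs.get)


# Hypothetical lexicon of tokens that may fill a slot (assumption).
LEXICON = {"airport", "museum", "zoo", "station", "42", "7"}

# Hypothetical labeled data: topic per utterance, and slot per candidate token.
TOPIC_DATA = [
    ("how do i get from airport to museum", "route_query"),
    ("which bus goes from zoo to station", "route_query"),
    ("where does bus 42 stop", "stops_query"),
    ("list the stops of bus 7", "stops_query"),
]
SLOT_DATA = {
    "route_query": [
        ("how do i get from airport to museum", {"airport": "origin", "museum": "destination"}),
        ("which bus goes from zoo to station", {"zoo": "origin", "station": "destination"}),
    ],
    "stops_query": [
        ("where does bus 42 stop", {"42": "bus_line"}),
        ("list the stops of bus 7", {"7": "bus_line"}),
    ],
}


def context_of(tokens, i):
    """Describe the candidate at position i by its previous and next word."""
    prev_tok = tokens[i - 1] if i > 0 else "<s>"
    next_tok = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
    return f"prev={prev_tok} next={next_tok}"


def train_slot_classifiers(slot_data):
    """Train one slot classifier per topic on the contexts of labeled candidates."""
    classifiers = {}
    for topic, examples in slot_data.items():
        contexts, slots = [], []
        for text, gold in examples:
            tokens = text.lower().split()
            for i, tok in enumerate(tokens):
                if tok in gold:
                    contexts.append(context_of(tokens, i))
                    slots.append(gold[tok])
        classifiers[topic] = NaiveBayes().fit(contexts, slots)
    return classifiers


def understand(text, topic_clf, slot_clfs):
    """Stage 1: pick the topic. Stage 2: fill slots with that topic's classifier."""
    topic = topic_clf.predict(text)
    tokens = text.lower().split()
    slots = {}
    for i, tok in enumerate(tokens):
        if tok in LEXICON:
            slots[slot_clfs[topic].predict(context_of(tokens, i))] = tok
    return topic, slots


def self_train(labeled, unlabeled, threshold=0.9):
    """One pass of naive self-training: add confidently auto-labeled utterances."""
    clf = NaiveBayes().fit(*zip(*labeled))
    confident = []
    for text in unlabeled:
        probs = clf.predict_proba(text)
        label = max(probs, key=probs.get)
        if probs[label] >= threshold:
            confident.append((text, label))
    return NaiveBayes().fit(*zip(*(labeled + confident)))


if __name__ == "__main__":
    topic_clf = self_train(TOPIC_DATA, ["how can i go from station to zoo"])
    slot_clfs = train_slot_classifiers(SLOT_DATA)
    print(understand("how can i go from station to zoo", topic_clf, slot_clfs))
    print(understand("what are the stops of bus 42", topic_clf, slot_clfs))
```

In this sketch the slot classifier is chosen only after the topic is known, so the same surface context can map to different slots under different topics. The self-training step keeps only predictions above the threshold, which is one simple way to limit the noise that automatically labeled utterances introduce; active learning would instead send the least confident utterances to a human annotator.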