Rapid bootstrapping of statistical spoken dialogue systems

  • Authors:
  • Ruhi Sarikaya

  • Affiliations:
  • IBM T.J. Watson Research Center, 1101 Kitchawan Road/Route 134, Yorktown Heights, NY 10598, United States

  • Venue:
  • Speech Communication
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Rapid deployment of statistical spoken dialogue systems poses portability challenges for building new applications. We discuss the challenges that arise and focus on two main problems: (i) fast semantic annotation for statistical speech understanding and (ii) reliable and efficient statistical language modeling using limited in-domain resources. We address the first problem by presenting a new bootstrapping framework that uses a majority-voting based combination of three methods for the semantic annotation of a ''mini-corpus'' that is usually manually annotated. The three methods are a statistical decision tree based parser, a similarity measure and a support vector machine classifier. The bootstrapping framework results in an overall cost reduction of about a factor of two in the annotation effort compared to the baseline method. We address the second problem by devising a method to efficiently build reliable statistical language models for new spoken dialog systems, given limited in-domain data. This method exploits external text resources that are collected for other speech recognition tasks as well as dynamic text resources acquired from the World Wide Web. The proposed method is applied to a spoken dialog system in a financial transaction domain and a natural language call-routing task in a package shipment domain. The experiments demonstrate that language models built using external resources, when used jointly with the limited in-domain language model, result in relative word error rate reductions of 9-18%. Alternatively, the proposed method can be used to produce a 3-to-10 fold reduction for the in-domain data requirement to achieve a given performance level.