k-nearest neighbor Monte-Carlo control algorithm for POMDP-based dialogue systems

  • Authors:
  • F. Lefèvre; M. Gašić; F. Jurčíček; S. Keizer; F. Mairesse; B. Thomson; K. Yu; S. Young

  • Affiliations:
  • Cambridge University, Cambridge, UK (all authors)

  • Venue:
  • SIGDIAL '09 Proceedings of the SIGDIAL 2009 Conference: The 10th Annual Meeting of the Special Interest Group on Discourse and Dialogue
  • Year:
  • 2009

Abstract

In real-world applications, modelling dialogue as a POMDP requires the use of a summary space for the dialogue state representation to ensure tractability. Sub-optimal estimation of the value function governing the selection of system responses can then be obtained using a grid-based approach on the belief space. In this work, the Monte-Carlo control technique is extended so as to reduce training over-fitting and to improve robustness to semantic noise in the user input. This technique uses a database of belief vector prototypes to choose the optimal system action. A locally weighted k-nearest neighbor scheme is introduced to smooth the decision process by interpolating the value function, resulting in higher user simulation performance.
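
The abstract describes choosing a system action by interpolating the value function over a database of belief vector prototypes with a locally weighted k-nearest neighbor scheme. The following is a minimal sketch of that idea, not the authors' implementation. The Euclidean distance, the inverse-distance weighting, the data layout (a list of prototype points with one dictionary of action values each), and the names `knn_q_values` and `select_action` are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def knn_q_values(belief, prototypes, q_table, k=3, eps=1e-9):
    """Interpolate action values at `belief` from its k nearest prototypes.

    Assumed layout (hypothetical): prototypes[i] is a summary-space belief
    point, and q_table[i] maps each action name to its estimated value there.
    """
    # Euclidean distance from the query point to every stored prototype.
    dists = [(math.dist(belief, p), i) for i, p in enumerate(prototypes)]
    nearest = sorted(dists)[:k]
    # Inverse-distance weights (an assumption): closer prototypes count more.
    # eps avoids division by zero when the query coincides with a prototype.
    weights = [(1.0 / (d + eps), i) for d, i in nearest]
    total = sum(w for w, _ in weights)
    actions = q_table[0].keys()
    return {
        a: sum(w * q_table[i][a] for w, i in weights) / total
        for a in actions
    }


def select_action(belief, prototypes, q_table, k=3):
    """Pick the action with the highest interpolated value."""
    q = knn_q_values(belief, prototypes, q_table, k)
    return max(q, key=q.get)


# Toy data: three prototypes in a two-dimensional summary space.
prototypes = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9)]
q_table = [
    {"confirm": 1.0, "inform": 5.0},
    {"confirm": 3.0, "inform": 2.0},
    {"confirm": 4.0, "inform": 0.0},
]

action = select_action((0.7, 0.3), prototypes, q_table, k=2)
```

With `k=1` this reduces to a plain nearest-prototype lookup, which is the grid-based behaviour the abstract says the k-nearest neighbor scheme is meant to smooth. Larger `k` blends the values of several neighbouring prototypes, so a small shift in the belief point caused by semantic noise is less likely to flip the selected action.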