Gaussian processes for fast policy optimisation of POMDP-based dialogue managers

Authors:
M. Gašić; F. Jurčíček; S. Keizer; F. Mairesse; B. Thomson; K. Yu; S. Young
Affiliations:
Cambridge University, Cambridge, UK (all authors)
Venue:
SIGDIAL '10 Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Year:
2010

Abstract

Modelling dialogue as a Partially Observable Markov Decision Process (POMDP) enables a dialogue policy that is robust to speech understanding errors to be learnt. However, a major challenge in POMDP policy learning is maintaining tractability, so some approximation is unavoidable. We propose applying Gaussian processes to reinforcement learning of optimal POMDP dialogue policies in order (1) to speed up learning and (2) to obtain an estimate of the uncertainty of the approximation. We first demonstrate the idea on a simple voicemail dialogue task and then apply the method to a real-world tourist information dialogue task.
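
The second aim in the abstract, obtaining an uncertainty estimate alongside the approximation, can be illustrated with plain Gaussian process regression. The sketch below is not the paper's algorithm: it assumes a one-dimensional stand-in for the belief state, a squared-exponential kernel, and made-up noisy returns, and all function names and parameter values are hypothetical. It only shows that a GP posterior yields both a mean estimate and a variance that grows away from the observed points.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2, signal_var=1.0):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_test, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at x_test."""
    # Covariance of the noisy training targets.
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    # Cross-covariance between test and training points.
    k_st = rbf_kernel(x_test, x_train)
    # A Cholesky factorisation avoids forming an explicit inverse.
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    var = np.diag(rbf_kernel(x_test, x_test)) - np.sum(v ** 2, axis=0)
    return mean, var


# Assumed toy data: noisy returns observed at a few 1-D "belief points".
rng = np.random.default_rng(0)
x_train = np.array([0.1, 0.15, 0.2, 0.8])
y_train = np.sin(2 * np.pi * x_train) + 0.05 * rng.standard_normal(4)

# One query inside the cluster of observations, one far from any of them.
x_test = np.array([0.15, 0.5])
mean, var = gp_posterior(x_train, y_train, x_test)
print("posterior mean:", mean)
print("posterior variance:", var)
```

In this toy setting the posterior variance at 0.15, which sits among the observations, is much smaller than at 0.5, which is far from all of them. A learner could use that kind of signal to decide where more dialogue data is needed.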