A Tractable Hybrid DDN–POMDP Approach to Affective Dialogue Modeling for Probabilistic Frame-Based Dialogue Systems

  • Authors:
  • Trung H. Bui; Mannes Poel; Anton Nijholt; Job Zwiers

  • Affiliations:
  • Human Media Interaction Group, Department of Computer Science, University of Twente, Postbus 217, 7500 AE, Enschede, The Netherlands (all four authors). E-mail: thbui@stanford.edu, m.poel@ewi.utwente.nl, a.nijholt@ew ...

  • Venue:
  • Natural Language Engineering
  • Year:
  • 2009

Abstract

We propose a novel approach to developing a tractable affective dialogue model for probabilistic frame-based dialogue systems. The affective dialogue model, based on Partially Observable Markov Decision Process (POMDP) and Dynamic Decision Network (DDN) techniques, is composed of two main parts: the slot-level dialogue manager and the global dialogue manager. It has two new features: (1) it can deal with a large number of slots, and (2) it can take aspects of the user's affective state into account when deriving adaptive dialogue strategies. Our implemented prototype dialogue manager can handle hundreds of slots, where each individual slot might have hundreds of values. We illustrate our approach through a route navigation example in the crisis management domain. We conducted various experiments to evaluate the approach and to compare it with approximate POMDP techniques and handcrafted policies. The experimental results show that the DDN–POMDP policy outperforms three handcrafted policies when the user's action error is induced by stress, as well as when the observation error increases. Further, after optimizing its internal reward, the one-step look-ahead DDN–POMDP policy performs close to its state-of-the-art approximate POMDP counterparts.
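
To make the slot-level mechanics concrete, below is a minimal sketch of a discrete slot-level POMDP with a greedy one-step look-ahead policy. This is an illustrative assumption, not the paper's implementation: the class name `SlotPOMDP`, the tabular toy models, and the use of expected immediate reward as the look-ahead criterion are all ours; the paper's one-step look-ahead DDN additionally models the user's affective state and propagates the belief through the network.

```python
import numpy as np

class SlotPOMDP:
    """Toy slot-level POMDP with tabular models (illustrative assumption)."""

    def __init__(self, T, O, R):
        # T[a, s, s'] : transition model P(s' | s, a)
        # O[a, s', o] : observation model P(o | s', a)
        # R[a, s]     : immediate reward for taking action a in state s
        self.T, self.O, self.R = T, O, R

    def update_belief(self, b, a, o):
        # Bayesian belief update: b'(s') ∝ O(o | s', a) * sum_s T(s' | s, a) b(s)
        b_pred = self.T[a].T @ b           # prediction step
        b_new = self.O[a][:, o] * b_pred   # correction by observation likelihood
        return b_new / b_new.sum()

    def one_step_lookahead(self, b):
        # Greedy one-step policy: maximize expected immediate reward under b.
        return int(np.argmax(self.R @ b))


# Toy example: a binary slot (2 values), actions {0: ask, 1: confirm value 0},
# observations {0, 1}. All numbers are made up for illustration.
T = np.stack([np.eye(2), np.eye(2)])                  # slot value is static
O = np.stack([np.array([[0.8, 0.2], [0.2, 0.8]])]*2)  # 80%-reliable observations
R = np.array([[-1.0, -1.0],                           # asking always costs 1
              [ 5.0, -5.0]])                          # confirming value 0: +5 / -5

pomdp = SlotPOMDP(T, O, R)
b = np.array([0.5, 0.5])               # uniform prior over the two slot values
b = pomdp.update_belief(b, a=0, o=0)   # ask, then observe evidence for value 0
print(b)                               # -> [0.8 0.2]
print(pomdp.one_step_lookahead(b))     # -> 1 (confident enough to confirm)
```

A full slot-level manager in the spirit of the abstract would extend the state with the user's affective (e.g., stress) variables and couple hundreds of such slot models under a global dialogue manager; factoring the problem per slot is what keeps the hybrid model tractable at that scale.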