Towards developing general models of usability with PARADISE
Natural Language Engineering
PARADISE: a framework for evaluating spoken dialogue agents
ACL '97 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Speech Quality of VoIP: Assessment and Prediction
Predicting the quality and usability of spoken dialogue services
Speech Communication
Detecting Problematic Dialogs with Automated Agents
PIT '08 Proceedings of the 4th IEEE tutorial and research workshop on Perception and Interactive Technologies for Speech-Based Systems: Perception in Multimodal Dialogue Systems
Analysis of a new simulation approach to dialog system evaluation
Speech Communication
User simulation as testing for spoken dialog systems
SIGdial '08 Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue
Modeling user satisfaction with Hidden Markov Model
SIGDIAL '09 Proceedings of the SIGDIAL 2009 Conference: The 10th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Quality of Telephone-Based Spoken Dialogue Systems
So far, predictions of user quality judgments in response to spoken dialog systems have been made on the basis of interaction parameters describing the dialog, e.g. in the PARADISE framework. These parameters do not take into account the temporal position of events occurring in the dialog. It therefore seems promising to apply sequence classification algorithms to the raw annotations of the data instead of to interaction parameters describing the overall dialog. As dialogs can differ greatly in length, Hidden Markov Models (HMMs) and Markov Chains (MCs) are well suited: they describe the probability of transitioning to a state given only the previous state and the transition probabilities, so they can be trained on, and applied to, sequences of different lengths. This paper analyzes the feasibility of predicting user judgments with HMMs and MCs. To test the models, we acquire data with different types of users, constraining them to conduct interactions that are as similar as possible, and asking for user judgments after each turn. This allows us to compare predicted distributions of judgments with the distributions measured empirically. We also apply the models to less rich corpora and compare their results with those of Linear Regression models as used in the PARADISE framework.
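The Markov Chain variant described in the abstract can be illustrated with a minimal sketch. The example below is not the authors' implementation; the judgment scale, the toy dialogs, and the helper names (`train_markov_chain`, `predict_distribution`) are assumptions for illustration. It estimates initial and transition probabilities from per-turn judgment sequences of varying length, then propagates the distribution forward to predict the judgment distribution after a given number of turns:

```python
from collections import Counter

# Hypothetical training data: per-turn user judgments on a 1..5 scale,
# one sequence per dialog; sequences may have different lengths.
dialogs = [
    [3, 3, 4, 4, 5],
    [3, 2, 2, 3],
    [4, 4, 3, 4, 4, 5],
]

STATES = [1, 2, 3, 4, 5]

def train_markov_chain(sequences, states):
    """Estimate initial and transition probabilities (add-one smoothing)."""
    init = Counter()
    trans = {s: Counter() for s in states}
    for seq in sequences:
        init[seq[0]] += 1
        for prev, cur in zip(seq, seq[1:]):
            trans[prev][cur] += 1
    n = len(states)
    pi = {s: (init[s] + 1) / (len(sequences) + n) for s in states}
    P = {s: {t: (trans[s][t] + 1) / (sum(trans[s].values()) + n)
             for t in states}
         for s in states}
    return pi, P

def predict_distribution(pi, P, num_turns, states):
    """Predicted distribution over judgments after num_turns turns."""
    dist = dict(pi)
    for _ in range(num_turns - 1):
        dist = {t: sum(dist[s] * P[s][t] for s in states) for t in states}
    return dist

pi, P = train_markov_chain(dialogs, STATES)
dist = predict_distribution(pi, P, num_turns=4, states=STATES)
```

Because training and prediction only involve pairwise transitions, the same model handles dialogs of any length, which is the property the abstract highlights; an HMM extends this by treating the judgment as a hidden state emitting observable dialog events.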