Towards modeling user behavior in interactions mediated through an automated bidirectional speech translation system

  • Authors:
  • JongHo Shin; Panayiotis G. Georgiou; Shrikanth Narayanan

  • Affiliations:
  • Viterbi School of Engineering, University of Southern California, 3740 McClintock Av., EEB400, Los Angeles, CA 90089-2564, United States (all authors)

  • Venue:
  • Computer Speech and Language
  • Year:
  • 2010

Abstract

This paper addresses modeling user behavior in interactions between two people who do not share a common spoken language and communicate with the aid of an automated bidirectional speech translation system. These interaction settings are complex: the translation machine attempts to bridge the language gap by mediating the verbal communication, but the technology may not always be perfect. As a step toward understanding user behavior in this mediated communication scenario, usability data from doctor-patient dialogs involving a two-way English-Persian speech translation system are analyzed. We specifically consider user behavior in light of potential uncertainty in the communication between the interlocutors. We analyze Retry (Repeat and Rephrase) versus Accept behaviors in the mediated verbal channel, identify three user types (Accommodating, Normal, and Picky), and propose a dynamic Bayesian network model of user behavior. To validate the model, we performed offline and online experiments. The experimental results on offline data show that the correct user type is clearly identified when a user behaves consistently in a given interaction condition. In the online experiment, agent feedback was presented to users according to their user type. Analysis of user interviews, video data, questionnaires, and log data shows high user satisfaction and interaction efficiency.
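To make the modeling idea in the abstract concrete, the sketch below shows how a latent user type could be inferred from an observed sequence of Accept and Retry (Repeat or Rephrase) actions with a simple Bayesian filter, which is the core mechanism underlying a dynamic Bayesian network user model. This is not the authors' implementation; the conditional probabilities are hypothetical placeholders, not values reported in the paper.

```python
# Minimal sketch (assumed, not from the paper): Bayesian filtering of a latent
# user type from observed Accept / Retry actions in a mediated dialog.

USER_TYPES = ["Accommodating", "Normal", "Picky"]

# Hypothetical P(action | user type): Accommodating users tend to accept
# imperfect translations, Picky users tend to retry. Placeholder numbers only.
ACTION_LIKELIHOOD = {
    "Accommodating": {"Accept": 0.8, "Retry": 0.2},
    "Normal":        {"Accept": 0.6, "Retry": 0.4},
    "Picky":         {"Accept": 0.3, "Retry": 0.7},
}

def update_belief(belief, action):
    """One Bayesian update of P(user type) after observing one action."""
    posterior = {t: belief[t] * ACTION_LIKELIHOOD[t][action] for t in USER_TYPES}
    total = sum(posterior.values())
    return {t: p / total for t, p in posterior.items()}

if __name__ == "__main__":
    # Start from a uniform prior over the three user types.
    belief = {t: 1.0 / len(USER_TYPES) for t in USER_TYPES}
    # Example observation sequence from a turn-level interaction log.
    for action in ["Retry", "Retry", "Accept", "Retry"]:
        belief = update_belief(belief, action)
    print(max(belief, key=belief.get), belief)
```

In the paper's setting, such a posterior over user types could drive agent feedback (as in the online experiment); the sketch omits the temporal transition structure a full dynamic Bayesian network would add across dialog turns.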