Towards incremental speech generation in conversational systems

Authors:
Gabriel Skantze;Anna Hjalmarsson
Affiliations:
Department of Speech, Music and Hearing, KTH, Sweden; Department of Speech, Music and Hearing, KTH, Sweden
Venue:
Computer Speech and Language
Year:
2013

Abstract

This paper presents a model of incremental speech generation in practical conversational systems. The model allows a conversational system to incrementally interpret spoken input, while simultaneously planning, realising and self-monitoring the system response. If these processes are time-consuming and result in a response delay, the system can automatically produce hesitations to retain the floor. While speaking, the system utilises hidden and overt self-corrections to accommodate revisions in the system. The model has been implemented in a general dialogue system framework. Using this framework, we have implemented a conversational game application. A Wizard-of-Oz experiment is presented, where the automatic speech recogniser is replaced by a Wizard who transcribes the spoken input. In this setting, the incremental model allows the system to start speaking while the user's utterance is being transcribed. In comparison to a non-incremental version of the same system, the incremental version has a shorter response time and is perceived as more efficient by the users.
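
The hesitation behaviour described in the abstract can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the authors' implementation: the names `Segment` and `speak_incrementally`, the filler string, the example utterance, and all timings are hypothetical. Time is simulated in integer milliseconds, and self-corrections are not modelled.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One chunk of the planned response (hypothetical structure)."""
    text: str
    ready_at_ms: int   # simulated time at which the planner delivers the chunk
    duration_ms: int   # simulated time needed to speak the chunk


def speak_incrementally(segments, filler="eh...", filler_ms=500):
    """Return (onset_ms, text) pairs for everything the system says.

    The system starts speaking at time 0. Whenever the next segment has
    not been planned yet, it produces a filler to hold the floor instead
    of staying silent.
    """
    clock = 0
    spoken = []
    for seg in segments:
        # Hold the floor with hesitations until the segment is available.
        while seg.ready_at_ms > clock:
            spoken.append((clock, filler))
            clock += filler_ms
        spoken.append((clock, seg.text))
        clock += seg.duration_ms
    return spoken


# Assumed example: the second chunk arrives late, so fillers are inserted.
plan = [
    Segment("well", ready_at_ms=0, duration_ms=400),
    Segment("the price is", ready_at_ms=1000, duration_ms=800),
    Segment("forty crowns", ready_at_ms=1500, duration_ms=900),
]
output = speak_incrementally(plan)
for onset, text in output:
    print(onset, text)
```

In this sketch the system begins speaking at 0 ms, whereas a non-incremental system that waits for the full plan could not start before 1500 ms, when the last chunk is ready.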