Turn-taking decisions in multiparty settings are complex, especially when the participants are children. Our goal is to endow an interactive character with appropriate turn-taking behavior using visual, audio, and contextual features. To that end, we investigate three distinct turn-taking models: a baseline model grounded in established turn-taking rules for adults and two machine learning models, one trained with data collected in situ and the other trained with data collected in more controlled conditions. The three models are shown to have different profiles of behavior during silences, overlapping speech, and at the end of participants' turns. An exploratory user evaluation focusing on the decision points where the models differ showed clear preference for the machine learning models over the baseline model. The results indicate that the rules for language interactions with small groups of children are not simply an extension of the rules for interacting with small groups of adults.
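To make the kind of decision the abstract describes concrete, the following is a minimal sketch of a rule-based baseline turn-taking policy of the sort grounded in adult turn-taking conventions. It is not the paper's actual implementation: the feature names (`silence_ms`, `participant_speaking`, `gaze_at_character`, `turn_complete`) and the silence threshold are illustrative assumptions standing in for the visual, audio, and contextual features mentioned above.

```python
# Hedged sketch (hypothetical features, not the paper's implementation):
# a rule-based baseline turn-taking policy for an interactive character.

from dataclasses import dataclass


@dataclass
class TurnFeatures:
    silence_ms: float            # audio: time since the last participant speech
    participant_speaking: bool   # audio: someone currently holds the floor
    gaze_at_character: bool      # visual: a participant is looking at the agent
    turn_complete: bool          # contextual: last utterance sounded finished


def baseline_decision(f: TurnFeatures, silence_threshold_ms: float = 700.0) -> str:
    """Adult-style baseline: never interrupt; take the turn only after a
    sufficiently long silence that follows a completed utterance or a
    gaze cue directed at the character."""
    if f.participant_speaking:
        return "wait"  # avoid overlapping speech entirely
    if f.silence_ms >= silence_threshold_ms and (f.turn_complete or f.gaze_at_character):
        return "take"  # floor is free and a cue invites a response
    return "wait"
```

The learned models in the paper replace this hand-written rule with classifiers trained on labeled decision points; the abstract's finding is precisely that their behavior diverges from such baseline rules during silences and overlaps with child participants.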