Automatic role recognition based on conversational and prosodic behaviour

Authors:
Hugues Salamin;Alessandro Vinciarelli;Khiet Truong;Gelareh Mohammadi
Affiliations:
University of Glasgow, Glasgow, Scotland, UK;University of Glasgow, Glasgow, Scotland, UK;University of Twente, Enschede, Netherlands;Idiap Research Institute, Martigny, Switzerland
Venue:
Proceedings of the international conference on Multimedia
Year:
2010

Abstract

This paper proposes an approach for the automatic recognition of roles in settings such as news and talk shows, where roles correspond to specific functions like Anchorman, Guest or Interview Participant. The approach is based purely on nonverbal vocal behavioral cues, including who talks when and how much (turn-taking behavior), and statistical properties of pitch, formants, energy and speaking rate (prosodic behavior). The experiments were performed on a corpus of around 50 hours of broadcast material, and the accuracy, measured as the percentage of time correctly labeled in terms of role, is up to 89%. Both turn-taking and prosodic behavior lead to satisfactory results. Furthermore, on one database, their combination leads to a statistically significant improvement.
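The abstract describes two quantities that a short sketch may make more concrete: turn-taking cues ("who talks when and how much") and an accuracy defined as the percentage of time correctly labeled. The Python below is only an illustration under stated assumptions, not the authors' implementation: the turn format `(speaker, start, end)`, the function names `turn_taking_features` and `time_weighted_accuracy`, and the toy data are all hypothetical, and the paper's actual features and models are not reproduced here.

```python
from collections import defaultdict

# Assumed (hypothetical) turn format: (speaker_id, start_seconds, end_seconds).


def turn_taking_features(turns):
    """Per-speaker turn count and fraction of the total speaking time."""
    total = sum(end - start for _, start, end in turns)
    if total <= 0:
        raise ValueError("turns must cover a positive amount of time")
    stats = defaultdict(lambda: {"turns": 0, "time": 0.0})
    for speaker, start, end in turns:
        stats[speaker]["turns"] += 1
        stats[speaker]["time"] += end - start
    return {
        speaker: {"turns": s["turns"], "time_fraction": s["time"] / total}
        for speaker, s in stats.items()
    }


def time_weighted_accuracy(turns, true_roles, predicted_roles):
    """Fraction of speaking time whose speaker received the correct role."""
    total = sum(end - start for _, start, end in turns)
    if total <= 0:
        raise ValueError("turns must cover a positive amount of time")
    correct = sum(
        end - start
        for speaker, start, end in turns
        if true_roles[speaker] == predicted_roles[speaker]
    )
    return correct / total


# Toy data, invented for illustration only.
turns = [("A", 0, 30), ("B", 30, 40), ("A", 40, 70), ("C", 70, 100)]
true_roles = {"A": "Anchorman", "B": "Guest", "C": "Guest"}
predicted_roles = {"A": "Anchorman", "B": "Guest", "C": "Anchorman"}

features = turn_taking_features(turns)
accuracy = time_weighted_accuracy(turns, true_roles, predicted_roles)
```

In this toy case, speaker C is mislabeled and accounts for 30 of the 100 seconds, so the time-weighted accuracy is lower than the per-speaker accuracy would suggest: weighting by time penalizes errors on speakers who talk a lot more heavily than errors on brief contributors.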