Automatic role recognition based on conversational and prosodic behaviour

  • Authors:
  • Hugues Salamin;Alessandro Vinciarelli;Khiet Truong;Gelareh Mohammadi

  • Affiliations:
  • University of Glasgow, Glasgow, Scotland, UK;University of Glasgow, Glasgow, Scotland, UK;University of Twente, Enschede, Netherlands;Idiap Research Institute, Martigny, Switzerland

  • Venue:
  • Proceedings of the International Conference on Multimedia
  • Year:
  • 2010

Abstract

This paper proposes an approach for automatically recognizing roles in settings such as news and talk shows, where roles correspond to specific functions such as Anchorman, Guest or Interview Participant. The approach relies purely on nonverbal vocal behavioral cues: who talks when and how much (turn-taking behavior), and statistical properties of pitch, formants, energy and speaking rate (prosodic behavior). The experiments were performed on a corpus of around 50 hours of broadcast material, and the accuracy, measured as the percentage of time correctly labeled in terms of role, reaches up to 89%. Both turn-taking and prosodic behavior lead to satisfactory results. Furthermore, on one database, their combination leads to a statistically significant improvement.
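
The accuracy measure described in the abstract (percentage of time correctly labeled in terms of role) can be illustrated with a minimal sketch. The segment layout, the function name `time_weighted_accuracy`, and the example values below are assumptions made for illustration only; they are not taken from the paper and do not reproduce its data or evaluation code.

```python
from typing import List, Tuple

# Hypothetical segment layout: (start_sec, end_sec, true_role, predicted_role)
Segment = Tuple[float, float, str, str]


def time_weighted_accuracy(segments: List[Segment]) -> float:
    """Return the fraction of total time whose predicted role matches the true role."""
    total = sum(end - start for start, end, _, _ in segments)
    if total <= 0:
        raise ValueError("segments must cover a positive duration")
    # Only the duration of correctly labeled segments counts as correct time.
    correct = sum(
        end - start
        for start, end, true_role, predicted_role in segments
        if true_role == predicted_role
    )
    return correct / total


# Invented toy example: 100 seconds of audio, 20 seconds mislabeled.
example = [
    (0.0, 30.0, "Anchorman", "Anchorman"),
    (30.0, 50.0, "Guest", "Anchorman"),
    (50.0, 100.0, "Guest", "Guest"),
]

accuracy = time_weighted_accuracy(example)
print(f"{accuracy:.0%}")
```

Weighting by duration rather than counting segments means that a long, correctly labeled turn contributes more than a short, mislabeled one, which matches the "percentage of time" wording in the abstract.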