Improving speech processing through social signals: automatic speaker segmentation of political debates using role based turn-taking patterns

  • Authors:
  • Fabio Valente; Alessandro Vinciarelli

  • Affiliations:
  • Idiap Research Institute, Martigny, Switzerland; University of Glasgow, Glasgow, United Kingdom

  • Venue:
  • Proceedings of the 2nd international workshop on Social signal processing
  • Year:
  • 2010

Abstract

Several recent works on social signals have addressed the statistical modeling of social interaction in multi-party discussions, showing that characteristics such as turn-taking patterns can be modeled and predicted from the role each participant plays in the discussion. Conversely, this work investigates the use of social signals to improve conventional speech processing methods. We propose using role-induced turn-taking patterns to improve speaker diarization, the task of determining 'who spoke when' in an audio file. In detail, this work studies how to include this information as a statistical prior on speaker interactions when segmenting and clustering speakers in multi-party political debates. Experiments reveal that the proposed approach reduces the speaker error over the baseline by 25% when both the number of speakers and their roles are known, and by 13% relative when the pattern information is estimated from the data. Furthermore, no performance degradation is observed in any recording. Experiments are also carried out to investigate the contribution of the first-order Markov assumption, i.e. that the role of speaker n is conditionally dependent on the role of speaker n-1.
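
The first-order Markov assumption mentioned in the abstract can be illustrated with a small sketch. It is not the authors' system: the role labels, the toy turn sequences, the add-one smoothing, and the function names `estimate_role_transitions` and `log_prior` are all assumptions made for illustration. The sketch only shows how a role-to-role transition table P(role of speaker n | role of speaker n-1) might be estimated from labeled turn sequences and then used to score a candidate sequence of turns.

```python
from collections import Counter
import math


def estimate_role_transitions(role_sequences, roles, smoothing=1.0):
    """Estimate P(role_n | role_{n-1}) from sequences of role labels.

    Additive smoothing (an assumption of this sketch) keeps unseen
    transitions from receiving zero probability.
    """
    counts = Counter()
    for seq in role_sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[(prev, cur)] += 1

    probs = {}
    for prev in roles:
        total = sum(counts[(prev, cur)] for cur in roles) + smoothing * len(roles)
        probs[prev] = {
            cur: (counts[(prev, cur)] + smoothing) / total for cur in roles
        }
    return probs


def log_prior(seq, probs):
    """Log-probability of a role sequence under the first-order model
    (transitions only; the initial-role probability is omitted)."""
    return sum(math.log(probs[prev][cur]) for prev, cur in zip(seq, seq[1:]))


# Hypothetical role inventory and toy training data, not taken from the paper.
ROLES = ["moderator", "guest"]
training = [
    ["moderator", "guest", "moderator", "guest", "moderator"],
    ["moderator", "guest", "guest", "moderator", "guest"],
]

probs = estimate_role_transitions(training, ROLES)

alternating = ["moderator", "guest", "moderator", "guest"]
monologue = ["moderator", "moderator", "moderator", "moderator"]
print(log_prior(alternating, probs), log_prior(monologue, probs))
```

In a diarization setting, a prior of this kind would be combined with acoustic scores so that speaker sequences consistent with typical role-based turn-taking are favored. How that combination is done is not specified in the abstract and is left out here.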