A Hybrid Generative-Discriminative Approach to Speaker Diarization

  • Authors: Athanasios K. Noulas; Tim Kasteren; Ben J. Kröse

  • Affiliations: University of Amsterdam, Amsterdam, The Netherlands 1098 SJ (all authors)

  • Venue: MLMI '08: Proceedings of the 5th International Workshop on Machine Learning for Multimodal Interaction
  • Year: 2008

Abstract

In this paper we present a sound probabilistic approach to speaker diarization. We use a hybrid framework in which a discriminative model estimates a distribution over the number of speakers at each point of a multimodal stream. This distribution is then used as input to a generative model that can adapt to a novel test set and perform high-accuracy speaker diarization. We deal efficiently, and in a principled probabilistic manner, with the less common and therefore harder segments, such as silence and parts with multiple speakers.
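
As a rough illustration of how a per-frame distribution over the number of speakers could feed a generative step, the sketch below scores every subset of speakers for a single frame. It is a minimal sketch under stated assumptions, not the model from the paper: the function name `state_posterior`, the per-speaker activity scores `active_probs`, and the rule of spreading each count probability uniformly over subsets of that size are all hypothetical choices made for this example.

```python
from itertools import combinations
from math import comb


def state_posterior(count_probs, active_probs):
    """Posterior over speaker subsets for one frame (illustrative only).

    count_probs[n]  : assumed discriminative estimate of P(n speakers active);
                      must have len(active_probs) + 1 entries.
    active_probs[k] : assumed generative score in (0, 1) that speaker k is active.
    """
    k = len(active_probs)
    scores = {}
    for n in range(k + 1):
        for subset in combinations(range(k), n):
            # Generative term: product over speakers of "active" or "inactive" scores.
            gen = 1.0
            for s in range(k):
                gen *= active_probs[s] if s in subset else 1.0 - active_probs[s]
            # Count term: spread P(n) uniformly over all subsets of size n (assumption).
            scores[subset] = count_probs[n] / comb(k, n) * gen
    total = sum(scores.values())
    return {subset: value / total for subset, value in scores.items()}


# Two speakers; the count model strongly favours exactly one active speaker.
posterior = state_posterior(count_probs=[0.1, 0.8, 0.1], active_probs=[0.9, 0.2])
best_state = max(posterior, key=posterior.get)
print(best_state, round(posterior[best_state], 3))
```

The empty subset plays the role of silence and subsets with more than one element represent overlapping speech, so the rarer segment types are scored in the same way as single-speaker frames.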