Speaker diarization using one-class support vector machines

  • Authors:
  • Belkacem Fergani;Manuel Davy;Amrane Houacine

  • Affiliations:
  • LCPTS, USTHB, Department of Electrical Engineering, BP 32, El Alia Bab Ezzouar, Algiers, Algeria and LAGIS, UMR CNRS 8146 and INRIA-FUTURS SequeL, BP 48, Cité Scientifique, Villeneuve d'Ascq, ...;LAGIS, UMR CNRS 8146 and INRIA-FUTURS SequeL, BP 48, Cité Scientifique, Villeneuve d'Ascq, France;LCPTS, USTHB, Department of Electrical Engineering, BP 32, El Alia Bab Ezzouar, Algiers, Algeria

  • Venue:
  • Speech Communication
  • Year:
  • 2008

Abstract

This paper addresses speaker diarization, which consists of two steps: speaker turn detection and speaker clustering. Both steps require a metric for comparing speech segments. Here, we employ a novel metric based on one-class support vector machines (SVMs), recently introduced by one of the authors. We present our primary speaker diarization system, based on one-class SVMs, which is easy to build and configure. We show through several experiments on the NIST RT'03S and ESTER data sets that our approach competes with most standard approaches based on, e.g., Generalized Likelihood Ratios or Gaussian Mixture Models, and may be complementary to them. Moreover, our technique permits the use of heterogeneous acoustic feature vectors of any dimension while keeping the computational cost reasonable.
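
To illustrate how a segment-level metric can drive speaker turn detection, the following is a minimal sketch and not the authors' system. It assumes a sliding pair of adjacent windows over acoustic feature frames and, in place of the one-class SVM metric (which the abstract does not specify), substitutes a simpler kernel-based dissimilarity: the squared distance between the kernel mean embeddings of the two windows. The function names, the window length `win`, the kernel width `gamma`, and the `threshold` are all hypothetical choices made for this example.

```python
import math


def rbf(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def kernel_mean_distance(seg_a, seg_b, gamma=0.5):
    """Squared distance between the kernel mean embeddings of two segments.

    Assumption: this is a stand-in for the paper's one-class SVM metric,
    used here only because it needs no training step.
    """
    def avg(u, v):
        return sum(rbf(x, y, gamma) for x in u for y in v) / (len(u) * len(v))

    return avg(seg_a, seg_a) + avg(seg_b, seg_b) - 2.0 * avg(seg_a, seg_b)


def detect_turns(frames, win=10, threshold=0.5):
    """Return frame indices that look like speaker turns.

    A candidate index t is scored by comparing the `win` frames before t
    with the `win` frames from t onward. It is kept when its score reaches
    `threshold` and is a local maximum of the score curve.
    """
    scores = {
        t: kernel_mean_distance(frames[t - win:t], frames[t:t + win])
        for t in range(win, len(frames) - win + 1)
    }
    turns = []
    for t, s in scores.items():
        left = scores.get(t - 1, float("-inf"))
        right = scores.get(t + 1, float("-inf"))
        if s >= threshold and s > left and s >= right:
            turns.append(t)
    return turns


if __name__ == "__main__":
    # Toy data: two "speakers" with clearly separated 2-D features.
    frames = [[0.0, 0.0]] * 30 + [[3.0, 3.0]] * 30
    print(detect_turns(frames))
```

Because the kernel accepts vectors of any length, the same sketch runs unchanged on feature vectors of any dimension. The speaker clustering step could reuse the same dissimilarity to group the segments found between detected turns.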