Speaker diarization using one-class support vector machines

Authors:
Belkacem Fergani; Manuel Davy; Amrane Houacine
Affiliations:
LCPTS, USTHB, Department of Electrical Engineering, BP 32, El Alia Bab Ezzouar, Algiers, Algeria and LAGIS, UMR CNRS 8146 and INRIA-FUTURS SequeL, BP 48, Cité Scientifique, Villeneuve d'Ascq, ...; LAGIS, UMR CNRS 8146 and INRIA-FUTURS SequeL, BP 48, Cité Scientifique, Villeneuve d'Ascq, France; LCPTS, USTHB, Department of Electrical Engineering, BP 32, El Alia Bab Ezzouar, Algiers, Algeria
Venue:
Speech Communication
Year:
2008

Abstract

This paper addresses speaker diarization, which consists of two steps: speaker turn detection and speaker clustering. Both steps require a metric for comparing speech segments. Here, we employ a novel metric based on one-class support vector machines (SVMs), recently introduced by one of the authors. We present our primary speaker diarization system, which is based on one-class SVMs and is easy to build and configure. We show through several experiments on the NIST RT'03S and ESTER data sets that our approach competes with most standard approaches based on, e.g., Generalized Likelihood Ratios or Gaussian Mixture Models, and may be complementary to them. Moreover, our technique permits the use of heterogeneous acoustic feature vectors of any dimension while keeping the computational cost reasonable.