Effective metric-based speaker segmentation in the frequency domain

Authors:
Christoph Boehm;Franz Pernkopf
Affiliations:
Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria;Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria
Venue:
ICASSP '09 Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
Year:
2009



Abstract

In this paper, we present FREQDIST, an approach to speaker segmentation based on a distance measure applied in the frequency domain. To improve detection performance, the spectrum is reweighted using normalization techniques. In addition, noise-like (i.e., flat) spectra are removed based on their entropy. Experiments on the TIMIT database [1] and on Westdeutscher Rundfunk broadcast data show that our segmentation approach performs well compared with the DISTBIC algorithm [2]. In particular, on the TIMIT data our algorithm achieves a false alarm rate (FAR) less than half that of DISTBIC and a missed detection rate (MDR) of 7.0% instead of 13.1%.
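
The abstract names three ingredients (spectral normalization, entropy-based removal of flat spectra, and a frequency-domain distance) without giving their exact form. The following is a minimal sketch of how such ingredients could look, not the FREQDIST method itself. The sum-to-one normalization, the Euclidean distance, the 0.95 entropy threshold, and all function names are assumptions made for illustration.

```python
import math


def normalize(spectrum):
    """Scale a non-negative magnitude spectrum so its bins sum to 1.

    Assumption: the paper's actual reweighting scheme is not given in
    the abstract; sum-to-one scaling is used here only as a stand-in.
    """
    total = sum(spectrum)
    if total <= 0:
        raise ValueError("spectrum must contain positive energy")
    return [v / total for v in spectrum]


def spectral_entropy(spectrum):
    """Shannon entropy of the normalized spectrum, scaled to [0, 1].

    A flat (noise-like) spectrum gives a value near 1; a spectrum
    dominated by a single bin gives a value near 0.
    """
    if len(spectrum) < 2:
        raise ValueError("need at least two frequency bins")
    p = normalize(spectrum)
    h = -sum(v * math.log(v) for v in p if v > 0)
    return h / math.log(len(p))


def is_noise_like(spectrum, threshold=0.95):
    """Flag spectra whose entropy exceeds a (hypothetical) threshold."""
    return spectral_entropy(spectrum) > threshold


def spectral_distance(a, b):
    """Euclidean distance between two normalized spectra.

    Assumption: the distance measure used by FREQDIST is not specified
    in the abstract; Euclidean distance is an illustrative choice.
    """
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of bins")
    pa, pb = normalize(a), normalize(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))
```

In a metric-based segmenter of this kind, one would typically discard the frames flagged by `is_noise_like`, compute `spectral_distance` between the averaged spectra of adjacent analysis windows, and treat pronounced local maxima of the resulting distance curve as candidate speaker change points.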