Effective metric-based speaker segmentation in the frequency domain

  • Authors:
  • Christoph Boehm; Franz Pernkopf

  • Affiliations:
  • Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria;Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria

  • Venue:
  • ICASSP '09 Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
  • Year:
  • 2009

Abstract

In this paper, we present FREQDIST, an approach to speaker segmentation based on a distance measure applied in the frequency domain. To improve detection performance, the spectrum is reweighted using normalization techniques. In addition, noise-like (i.e., flat) spectra are removed based on their entropy. Experiments on the TIMIT database [1] and on Westdeutscher Rundfunk broadcast data show that our segmentation approach performs well compared with the DISTBIC algorithm [2]. In particular, on the TIMIT data our algorithm achieves a false alarm rate (FAR) less than half that of DISTBIC and a missed detection rate (MDR) of 7.0% instead of 13.1%.
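
The abstract names three ingredients: a spectral distance, spectrum normalization, and entropy-based removal of flat spectra. The following is a minimal illustrative sketch of how such pieces could fit together, not the FREQDIST algorithm itself. The abstract does not specify the distance measure, the normalization, or the entropy threshold, so the Euclidean distance, the sum-to-one normalization, the `max_entropy=0.9` cutoff, and all function names here are assumptions chosen for illustration.

```python
import math


def spectral_entropy(power):
    """Shannon entropy of a power spectrum, scaled to [0, 1].

    A flat (noise-like) spectrum scores close to 1, a peaked one close to 0.
    """
    total = sum(power)
    if total <= 0 or len(power) < 2:
        return 0.0
    probs = [p / total for p in power]
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(power))


def mean_spectrum(frames, max_entropy=0.9):
    """Average the sum-normalized spectra of the frames in one window.

    Frames whose entropy reaches max_entropy (an assumed threshold) are
    treated as noise-like and dropped. Returns None if no frame survives.
    """
    kept = [f for f in frames
            if sum(f) > 0 and spectral_entropy(f) < max_entropy]
    if not kept:
        return None
    n_bins = len(kept[0])
    avg = [0.0] * n_bins
    for f in kept:
        total = sum(f)
        for k in range(n_bins):
            avg[k] += f[k] / total / len(kept)
    return avg


def window_distance(left, right, max_entropy=0.9):
    """Euclidean distance (an assumed choice) between the mean spectra of
    two adjacent analysis windows; a large value hints at a speaker change."""
    a = mean_spectrum(left, max_entropy)
    b = mean_spectrum(right, max_entropy)
    if a is None or b is None:
        return 0.0  # no usable frames on one side: report no change
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

In a segmentation setting, `window_distance` would be evaluated on pairs of adjacent windows slid along the recording, and peaks in the resulting distance curve would be treated as candidate speaker change points. Each frame passed in is assumed to be a list of non-negative power values, one per frequency bin.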