The AMI speaker diarization system for NIST RT06s meeting data

  • Authors:
  • David A. van Leeuwen; Marijn Huijbregts

  • Affiliations:
  • TNO Human Factors, Soesterberg, The Netherlands; Department of EEMCS, Human Media Interaction, University of Twente, Enschede, The Netherlands

  • Venue:
  • MLMI'06: Proceedings of the Third International Conference on Machine Learning for Multimodal Interaction
  • Year:
  • 2006

Abstract

We describe the systems submitted to the NIST RT06s evaluation for the Speech Activity Detection (SAD) and Speaker Diarization (SPKR) tasks. For speech activity detection, a new analysis methodology is presented that generalizes the Detection Error Tradeoff analysis commonly used in speaker detection tasks. The speaker diarization systems are based on the TNO and ICSI systems submitted for RT05s. For the Single Distant Microphone condition of the conference room evaluation, the SAD system performs well at an error rate of 4.23%, and the ‘HMM-BIC’ SPKR system performs competitively at an error rate of 37.2%, including overlapping speech.