The 2006 Athens Information Technology Speech Activity Detection and Speaker Diarization Systems

  • Authors:
  • Elias Rentzeperis;Andreas Stergiou;Christos Boukis;Aristodemos Pnevmatikakis;Lazaros C. Polymenakos

  • Affiliations:
  • Autonomic & Grid Computing Group, Athens Information Technology, Athens, Greece (all five authors)

  • Venue:
  • MLMI'06 Proceedings of the Third International Conference on Machine Learning for Multimodal Interaction
  • Year:
  • 2006

Abstract

This paper describes the Speech Activity Detection (SAD) and Speaker Diarization (SPKR) systems developed by Athens Information Technology for the NIST RT-06S evaluations. The SAD system classifies recorded frames as speech or non-speech using Linear Discriminant Analysis (LDA). The SPKR system first segments recordings into speech intervals using the Bayesian Information Criterion (BIC), and then applies a two-step clustering strategy to group segments from the same speaker. Following a discussion of the internals of the two systems, we report and comment on our results on the RT-06S corpus [20].
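
To make the BIC-based segmentation step more concrete, the following is a minimal sketch of the standard delta-BIC test for a single speaker change point. The abstract only names the criterion, so everything else here is an assumption for illustration: modeling each segment with one full-covariance Gaussian, the penalty weight `lam`, the `margin` parameter, and the function names `delta_bic` and `best_change_point` are hypothetical and are not taken from the paper.

```python
import numpy as np


def logdet_cov(x):
    """Log-determinant of the maximum-likelihood covariance of the rows of x."""
    d = x.shape[1]
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
    # A small ridge keeps the determinant finite for short or degenerate segments.
    cov = cov + 1e-6 * np.eye(d)
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(x, t, lam=1.0):
    """Delta-BIC for splitting feature matrix x (frames x dims) at frame t.

    Positive values favor two Gaussians (a change at t) over a single one.
    lam is an assumed tunable penalty weight.
    """
    n, d = x.shape
    # Extra parameters of the two-model hypothesis: one mean and one covariance.
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    return (0.5 * n * logdet_cov(x)
            - 0.5 * t * logdet_cov(x[:t])
            - 0.5 * (n - t) * logdet_cov(x[t:])
            - penalty)


def best_change_point(x, margin=20, lam=1.0):
    """Return (frame index, score) of the highest delta-BIC candidate.

    margin (assumed) keeps enough frames on each side to estimate a covariance.
    """
    candidates = range(margin, len(x) - margin)
    scores = [delta_bic(x, t, lam) for t in candidates]
    k = int(np.argmax(scores))
    return candidates[k], scores[k]
```

In this sketch a change point would be accepted only if the returned score is positive. A full segmenter would typically slide or grow an analysis window over the recording and repeat the test, which is omitted here.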