This paper describes the Speech Activity Detection (SAD) and Speaker Diarization (SPKR) systems developed by Athens Information Technology for the NIST RT-06S evaluations. The SAD system classifies recorded frames as speech or non-speech using Linear Discriminant Analysis (LDA). The SPKR system first segments recordings into speech intervals based on the Bayesian Information Criterion (BIC), and then applies a two-step clustering strategy to group segments from the same speaker. Following a discussion of the internals of the two systems, we report and comment on our results on the RT-06S corpus [20].
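As an illustration of BIC-based segmentation in general, the sketch below scores candidate change points in a sequence of feature frames with the commonly used ΔBIC test: one Gaussian fitted to the whole window is compared against two Gaussians fitted to the left and right parts, and a positive score favours a boundary. This is not the system described above. The diagonal-covariance model, the function names (`log_det_diag`, `delta_bic`, `best_change_point`), the `penalty_weight` and `margin` parameters, and the synthetic data are all assumptions made for the example.

```python
import math
import random


def log_det_diag(frames):
    """Log-determinant of the diagonal ML covariance of a list of frames."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for k in range(dim):
        mean = sum(f[k] for f in frames) / n
        var = sum((f[k] - mean) ** 2 for f in frames) / n
        total += math.log(max(var, 1e-10))  # floor avoids log(0)
    return total


def delta_bic(frames, t, penalty_weight=1.0):
    """Delta-BIC for splitting `frames` at index t; positive favours a split."""
    left, right = frames[:t], frames[t:]
    n, n1, n2 = len(frames), len(left), len(right)
    dim = len(frames[0])
    # Likelihood gain of two Gaussians over a single one.
    gain = 0.5 * (n * log_det_diag(frames)
                  - n1 * log_det_diag(left)
                  - n2 * log_det_diag(right))
    # The split adds one extra diagonal Gaussian: dim means + dim variances.
    penalty = 0.5 * penalty_weight * (2 * dim) * math.log(n)
    return gain - penalty


def best_change_point(frames, margin=10, penalty_weight=1.0):
    """Return the index with the largest positive Delta-BIC, or None."""
    best_t, best_score = None, 0.0
    for t in range(margin, len(frames) - margin + 1):
        score = delta_bic(frames, t, penalty_weight)
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Synthetic two-dimensional "features": the source changes at frame 100.
rng = random.Random(0)
frames = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(100)]
frames += [[rng.gauss(5.0, 1.0), rng.gauss(5.0, 1.0)] for _ in range(100)]
change = best_change_point(frames)
print("detected change point:", change)
```

In practice such a test is usually run over a sliding or growing window, and `penalty_weight` is tuned on development data; both choices are outside what the abstract specifies.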