This paper presents several pre-processing techniques coupled with three speaker diarization systems, evaluated in the framework of the NIST 2005 Spring Rich Transcription campaign (RT'05S). The pre-processing techniques provide a signal quality index that is used to build a single “virtual” signal from all the microphone recordings available for a meeting. This virtual signal is a weighted sum of the individual microphone signals, and the signal quality index is derived from a signal-to-noise ratio. Two methods are used to compute the instantaneous signal-to-noise ratio: an approach based on speech activity detection and a noise spectrum estimate. The speaker diarization task is performed using systems developed by three labs: LIA, LIUM and CLIPS. Among the system submissions made by these labs, the best system obtained a speaker diarization error of 24.5% for the conference subdomain and 18.4% for the lecture subdomain.
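The idea of a virtual signal built as an SNR-weighted sum of microphone channels can be illustrated with a minimal sketch. It is not the paper's implementation. The function names, the frame length, the rule that the quietest 20% of frames stand in for noise, and the choice of weights proportional to the linear SNR are all assumptions made for illustration. For brevity the sketch also uses one weight per channel for the whole recording, whereas the abstract refers to an instantaneous signal-to-noise ratio.

```python
import math


def frame_energies(signal, frame_len):
    """Mean energy of each non-overlapping frame of the signal."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def snr_db(signal, frame_len=4, floor=1e-12):
    """Crude SNR estimate in dB (assumed rule, not the paper's method).

    The quietest 20% of frames (at least one) are treated as noise;
    the remaining frames are treated as speech.
    """
    energies = sorted(frame_energies(signal, frame_len))
    n_noise = max(1, len(energies) // 5)
    noise = sum(energies[:n_noise]) / n_noise
    speech_frames = energies[n_noise:] or energies
    speech = sum(speech_frames) / len(speech_frames)
    return 10 * math.log10(max(speech, floor) / max(noise, floor))


def combine_channels(channels, frame_len=4):
    """Return (virtual_signal, weights) for a list of channels.

    Weights are proportional to each channel's linear SNR and sum to 1
    (an assumed weighting rule). The output is truncated to the length
    of the shortest channel.
    """
    linear = [10 ** (snr_db(ch, frame_len) / 10) for ch in channels]
    total = sum(linear)
    weights = [v / total for v in linear]
    n = min(len(ch) for ch in channels)
    virtual = [
        sum(w * ch[t] for w, ch in zip(weights, channels))
        for t in range(n)
    ]
    return virtual, weights


# Toy example: the same "speech" burst recorded with a quiet and a
# noisy background (synthetic values, not real meeting data).
clean_mic = [0.01] * 8 + [1.0, -1.0] * 8
noisy_mic = [0.5] * 8 + [1.0, -1.0] * 8
virtual, weights = combine_channels([clean_mic, noisy_mic])
```

In this toy setup the channel with the quieter background receives the larger weight, so the virtual signal is dominated by the cleaner microphone.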