This paper presents the LIMSI speaker diarization system for lecture data, developed in the framework of the Rich Transcription 2006 Spring (RT-06S) meeting recognition evaluation. The system builds upon the baseline diarization system designed for broadcast news data, which combines agglomerative clustering based on the Bayesian information criterion (BIC) with a second clustering stage using state-of-the-art speaker identification techniques. In the RT-04F evaluation, the baseline system achieved an overall diarization error of 8.5% on broadcast news data. However, because it has a high missed speech error rate on lecture data, a different speech activity detection approach was explored, based on the log-likelihood ratio between speech and non-speech models trained on the seminar data. The new speaker diarization system integrating this module achieves an overall diarization error of 20.2% on the RT-06S Multiple Distant Microphone (MDM) data.
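The log-likelihood-ratio idea behind the speech activity detector can be illustrated with a minimal sketch. It is not the LIMSI implementation. For brevity it assumes a single diagonal-covariance Gaussian per class rather than a Gaussian mixture, synthetic 4-dimensional features in place of real acoustic features, and hypothetical values for the decision threshold and smoothing window. All function names are illustrative.

```python
import numpy as np

def fit_diag_gaussian(frames):
    """Estimate a diagonal-covariance Gaussian (mean, variance) from feature frames."""
    mean = frames.mean(axis=0)
    var = frames.var(axis=0) + 1e-6  # small floor to avoid division by zero
    return mean, var

def log_likelihood(frames, model):
    """Per-frame log-likelihood under a diagonal Gaussian."""
    mean, var = model
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (frames - mean) ** 2 / var, axis=1)

def detect_speech(frames, speech_model, nonspeech_model, threshold=0.0, smooth=5):
    """Label frames as speech when the smoothed log-likelihood ratio exceeds a threshold.

    threshold and smooth are assumed values for illustration only.
    """
    llr = log_likelihood(frames, speech_model) - log_likelihood(frames, nonspeech_model)
    kernel = np.ones(smooth) / smooth                    # moving-average window
    llr_smooth = np.convolve(llr, kernel, mode="same")   # suppress isolated frame flips
    return llr_smooth > threshold

# Synthetic stand-ins for labeled training data (assumption: real systems use acoustic features).
rng = np.random.default_rng(0)
train_speech = rng.normal(3.0, 1.0, size=(500, 4))
train_nonspeech = rng.normal(0.0, 1.0, size=(500, 4))
speech_model = fit_diag_gaussian(train_speech)
nonspeech_model = fit_diag_gaussian(train_nonspeech)

# Test signal: 50 non-speech frames followed by 50 speech frames.
test_frames = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 4)),
    rng.normal(3.0, 1.0, size=(50, 4)),
])
speech_mask = detect_speech(test_frames, speech_model, nonspeech_model)
```

Raising the threshold trades missed speech for fewer false alarms, which is the kind of balance a detector tuned on seminar data would need to strike. The moving average stands in for whatever temporal smoothing or minimum-duration constraint a full system would apply.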