This paper presents the design and results of the Rich Transcription Spring 2005 (RT-05S) Meeting Recognition Evaluation. This evaluation is the third in a series of community-wide evaluations of language technologies in the meeting domain. For 2005, four evaluation tasks were supported: a speech-to-text (STT) transcription task and three diarization tasks, “Who Spoke When”, “Speech Activity Detection”, and “Source Localization”. The latter two were first-time experimental proof-of-concept tasks and were treated as “dry runs”. For the STT task, the lowest word error rate for the multiple distant microphone condition was 30.0%, which represented an impressive 33% relative reduction from the best result obtained in the last such evaluation – the Rich Transcription Spring 2004 Meeting Recognition Evaluation. For the diarization “Who Spoke When” task, the lowest diarization error rate was 18.56%, which represented a 19% relative reduction from that of RT-04S.
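The relative reductions quoted above follow the standard formula (old − new) / old. A minimal sketch of that arithmetic, using RT-04S baseline figures that are merely back-derived from the stated percentages (approximately 44.8% WER and 22.9% DER, not values confirmed by this abstract):

```python
def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Relative error-rate reduction: fraction of the old rate eliminated."""
    return (old_rate - new_rate) / old_rate

# Hypothetical RT-04S baselines back-derived from the stated reductions.
wer_reduction = relative_reduction(44.8, 30.0)   # STT, multiple distant mics
der_reduction = relative_reduction(22.9, 18.56)  # "Who Spoke When" diarization

print(f"WER relative reduction: {wer_reduction:.0%}")  # ~33%
print(f"DER relative reduction: {der_reduction:.0%}")  # ~19%
```

Note that these are relative, not absolute, reductions: the 33% figure means a third of the previous word error rate was eliminated, not that the rate dropped by 33 percentage points.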