We describe the systems submitted to the NIST RT06s evaluation for the Speech Activity Detection (SAD) and Speaker Diarization (SPKR) tasks. For speech activity detection, we present a new analysis methodology that generalizes the Detection Error Tradeoff analysis commonly used in speaker detection tasks. The speaker diarization systems are based on the TNO and ICSI system submitted for RT05s. For the conference room evaluation under the Single Distant Microphone condition, the SAD system performs well, with an error rate of 4.23%, and the 'HMM-BIC' SPKR system performs competitively, with an error rate of 37.2% including overlapping speech.
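For readers unfamiliar with Detection Error Tradeoff analysis, the following is a minimal sketch of the standard form of the technique applied to frame-level speech activity detection. It is not the generalized methodology of the paper, which the abstract does not specify. The function name `det_points`, the per-frame score representation, and the rule that a frame counts as speech when its score is at or above the threshold are all assumptions made for illustration.

```python
from typing import List, Tuple


def det_points(
    scores: List[float],
    labels: List[bool],
    thresholds: List[float],
) -> List[Tuple[float, float, float]]:
    """Return (threshold, miss_rate, false_alarm_rate) for each threshold.

    Assumptions (hypothetical, not taken from the paper):
    - scores[i] is a per-frame speech score, higher meaning more speech-like;
    - labels[i] is True when frame i is speech in the reference;
    - a frame is classified as speech when its score >= threshold.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    n_speech = sum(1 for is_speech in labels if is_speech)
    n_nonspeech = len(labels) - n_speech
    if n_speech == 0 or n_nonspeech == 0:
        raise ValueError("need at least one speech and one non-speech frame")

    points = []
    for threshold in thresholds:
        # Miss: a reference speech frame scored below the threshold.
        misses = sum(
            1 for s, is_speech in zip(scores, labels)
            if is_speech and s < threshold
        )
        # False alarm: a reference non-speech frame scored at or above it.
        false_alarms = sum(
            1 for s, is_speech in zip(scores, labels)
            if not is_speech and s >= threshold
        )
        points.append(
            (threshold, misses / n_speech, false_alarms / n_nonspeech)
        )
    return points


if __name__ == "__main__":
    # Toy data: three speech frames followed by two non-speech frames.
    toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
    toy_labels = [True, True, True, False, False]
    for point in det_points(toy_scores, toy_labels, [0.25, 0.5]):
        print(point)
```

Sweeping the threshold traces the tradeoff between missed speech and false alarms; plotting the two rates against each other gives the usual DET curve.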