3rd international workshop on automated information extraction in media production
Proceedings of the international conference on Multimedia
In this paper we discuss the challenges of scaling a speaker retrieval system for small audiovisual collections up to a speaker retrieval system for large audio(visual) archives. We show that our large-scale speaker diarization approach makes query-by-example speaker retrieval possible, that is, searching for audiovisual documents in which a particular person is talking. On a selection of the ICSI meeting corpus we obtain a Mean Average Precision of 0.49 and a precision-at-ten of 0.70. On a much larger archive of three months of Dutch broadcast television we obtain a precision-at-ten of 0.52.
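The two reported metrics can be illustrated with a minimal sketch. It assumes the standard textbook definitions of precision-at-k and Mean Average Precision; the abstract does not spell out the exact evaluation protocol, so the function names, the handling of short result lists, and the toy relevance judgments below are all hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence


def precision_at_k(relevance: Sequence[bool], k: int = 10) -> float:
    """Fraction of the top-k ranked documents that are relevant.

    Assumption: a result list shorter than k is still divided by k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for hit in relevance[:k] if hit) / k


def average_precision(relevance: Sequence[bool],
                      total_relevant: Optional[int] = None) -> float:
    """Average of the precision values at each rank holding a relevant document.

    total_relevant is the number of relevant documents in the collection.
    Assumption: if it is not given, the relevant documents found in the
    ranked list are taken to be all there are.
    """
    hits = 0
    precision_sum = 0.0
    for rank, hit in enumerate(relevance, start=1):
        if hit:
            hits += 1
            precision_sum += hits / rank
    denominator = total_relevant if total_relevant is not None else hits
    return precision_sum / denominator if denominator else 0.0


def mean_average_precision(queries: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-query average precision over all example queries."""
    if not queries:
        return 0.0
    return sum(average_precision(q) for q in queries) / len(queries)


if __name__ == "__main__":
    # Toy judgments (hypothetical): True means the target speaker talks
    # in the document returned at that rank.
    speaker_a = [True, False, True]
    speaker_b = [False, True]
    print(precision_at_k(speaker_a, k=3))
    print(mean_average_precision([speaker_a, speaker_b]))
```

In this reading, each query is an example of one speaker, and each ranked list holds the documents returned for that speaker. Precision-at-ten only looks at the first ten results, which is why it can be reported for a very large archive where exhaustive relevance judgments (needed for Mean Average Precision) may not be available.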