Towards automatic speaker retrieval for large multimedia archives

Authors:
Marijn Huijbregts;David van Leeuwen
Affiliations:
Radboud University Nijmegen, Nijmegen, Netherlands;Radboud University Nijmegen, Nijmegen, Netherlands
Venue:
Proceedings of the 3rd international workshop on Automated information extraction in media production
Year:
2010

Abstract

In this paper we discuss the challenges of scaling a speaker retrieval system designed for small audiovisual collections up to large audio(visual) archives. We show that our large-scale speaker diarization approach makes query-by-example speaker retrieval possible, that is, searching for audiovisual documents in which a particular person is talking. On a selection of the ICSI meeting corpus we obtain a Mean Average Precision of 0.49 and a precision-at-ten of 0.70. On a much larger archive of three months of Dutch broadcast television we obtain a precision-at-ten of 0.52.
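
For readers unfamiliar with the two measures reported above, the following minimal sketch shows how precision-at-k and Mean Average Precision are commonly computed from ranked retrieval results. It is an illustration of the standard definitions, not the authors' evaluation code; the function names and the boolean relevance-list input format are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def precision_at_k(relevance: Sequence[bool], k: int = 10) -> float:
    """Fraction of the top-k ranked documents that are relevant.

    `relevance` is a hypothetical input format: one flag per returned
    document, in rank order (best match first).
    """
    return sum(relevance[:k]) / k


def average_precision(relevance: Sequence[bool], num_relevant: int) -> float:
    """Average of the precision values at each rank holding a relevant document.

    `num_relevant` is the total number of relevant documents in the archive
    for this query, including any that were not retrieved.
    """
    if num_relevant == 0:
        return 0.0
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / num_relevant


def mean_average_precision(runs: List[Tuple[Sequence[bool], int]]) -> float:
    """Mean of the per-query average precision over all queries."""
    return sum(average_precision(rel, n) for rel, n in runs) / len(runs)
```

Under these definitions, a precision-at-ten of 0.70 means that, averaged over the query speakers, seven of the ten highest-ranked documents contain the speaker who was searched for.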