Speaker indexing in audio archives using Gaussian mixture scoring simulation

Authors:
Hagai Aronowitz;David Burshtein;Amihood Amir
Affiliations:
Department of Computer Science, Bar-Ilan University, Israel;School of Electrical Engineering, Tel-Aviv University, Israel;Department of Computer Science, Bar-Ilan University, Israel
Venue:
MLMI'04 Proceedings of the First International Conference on Machine Learning for Multimodal Interaction
Year:
2004

Abstract

Speaker indexing has recently emerged as an important task due to the rapidly growing volume of audio archives. Current filtration techniques still suffer from problems in both accuracy and efficiency. In this paper, an efficient method to simulate GMM scoring is presented. Simulation is done by fitting a GMM not only to every target speaker but also to every test utterance, and then computing the likelihood of the test utterance from these GMMs instead of from the original data. GMM simulation is used to achieve very efficient speaker indexing in terms of both search time and index size. Results on the SPIDRE and NIST-2004 speaker evaluation corpora show that our approach maintains, and sometimes exceeds, the accuracy of the conventional GMM algorithm while achieving efficient indexing: it is 6000 times faster than a conventional GMM, with 1% overhead in storage.
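
The idea of scoring from two GMMs rather than from raw frames can be illustrated with a minimal sketch. It is not necessarily the authors' formulation: it assumes diagonal-covariance GMMs and approximates the expected log-likelihood of the target model under the test-utterance model by matching each test component to its best target component. The function names and the (weights, means, variances) tuple layout are hypothetical.

```python
import numpy as np

def expected_log_gauss(mu_p, var_p, mu_q, var_q):
    """Closed form of E_{x ~ N(mu_p, diag(var_p))}[log N(x; mu_q, diag(var_q))]."""
    return -0.5 * np.sum(
        np.log(2.0 * np.pi * var_q) + (var_p + (mu_p - mu_q) ** 2) / var_q
    )

def simulated_score(test_gmm, target_gmm):
    """Approximate per-frame log-likelihood of a test utterance under a target GMM,
    using only the parameters of a GMM fitted to the test utterance.

    Assumption (illustrative): each test component contributes the score of its
    single best-matching target component, weighted by the test component weight.
    """
    w_p, mu_p, var_p = test_gmm      # GMM fitted to the test utterance
    w_q, mu_q, var_q = target_gmm    # GMM fitted to the target speaker
    score = 0.0
    for g in range(len(w_p)):
        best = max(
            np.log(w_q[j]) + expected_log_gauss(mu_p[g], var_p[g], mu_q[j], var_q[j])
            for j in range(len(w_q))
        )
        score += w_p[g] * best
    return score

# Toy usage: a 2-component test GMM scored against two candidate target GMMs.
test_gmm = (np.array([0.5, 0.5]),
            np.array([[0.0, 0.0], [3.0, 3.0]]),
            np.ones((2, 2)))
near_target = (np.array([0.5, 0.5]),
               np.array([[0.1, 0.0], [3.0, 2.9]]),
               np.ones((2, 2)))
far_target = (np.array([0.5, 0.5]),
              np.array([[10.0, 10.0], [-8.0, -8.0]]),
              np.ones((2, 2)))
print(simulated_score(test_gmm, near_target) > simulated_score(test_gmm, far_target))
```

In this sketch the cost of a score depends only on the number of Gaussian components, not on the number of frames in the utterance, and only the fitted parameters need to be kept in the index, which is the kind of saving in search time and storage the abstract describes.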