Audio-Visual feature fusion for speaker identification
ICONIP'12 Proceedings of the 19th international conference on Neural Information Processing - Volume Part I
International Journal of Speech Technology
Hi-index | 0.00 |
The introduction of Gaussian Mixture Models (GMMs) in the field of speaker verification has led to very good results. This paper illustrates an evolution in state-of-the-art Speaker Verification by highlighting the contribution of recently established information theoretic based vector quantization technique. We explore the novel application of three different vector quantization algorithms, namely K-means, Linde-Buzo-Gray (LBG) and Information Theoretic Vector Quantization (ITVQ) for efficient speaker verification. The Expectation Maximization (EM) algorithm used by GMM requires a prohibitive amount of iterations to converge. In this paper, comparable alternatives to EM including K-means, LBG and ITVQ algorithm were tested. The GMM-ITVQ algorithm was found to be the most efficient alternative for the GMM-EM. It gives correct classification rates at a similar level to that of GMM-EM. Finally, representative performance benchmarks and system behaviour experiments on NIST SRE corpora are presented.