Evaluation of EMD-based speaker recognition using ISCSLP2006 Chinese speaker recognition evaluation corpus

Authors:
Shingo Kuroiwa;Satoru Tsuge;Masahiko Kita;Fuji Ren
Affiliations:
Faculty of Engineering, The University of Tokushima, Tokushimashi, Japan;Faculty of Engineering, The University of Tokushima, Tokushimashi, Japan;Faculty of Engineering, The University of Tokushima, Tokushimashi, Japan;Faculty of Engineering, The University of Tokushima, Tokushimashi, Japan
Venue:
ISCSLP'06 Proceedings of the 5th international conference on Chinese Spoken Language Processing
Year:
2006

Citing 2
Cited 0

Nonparametric Speaker Recognition Method Using Earth Mover's Distance

IEICE - Transactions on Information and Systems
Effects of Phoneme Type and Frequency on Distributed Speaker Identification and Verification

IEICE - Transactions on Information and Systems

Abstract

In this paper, we present evaluation results for our proposed text-independent speaker recognition method based on the Earth Mover’s Distance (EMD), using the ISCSLP2006 Chinese speaker recognition evaluation corpus developed by the Chinese Corpus Consortium (CCC). EMD-based speaker recognition (EMD-SR) was originally designed for a distributed speaker identification system, in which feature vectors are compressed by vector quantization at a terminal and sent to a server that performs pattern matching. In this architecture, speaker models had to be trained on quantized data, so we adopted a non-parametric speaker model and EMD. In experiments on a Japanese speech corpus, EMD-SR was more robust to quantized data than the conventional GMM technique. Moreover, it achieved higher accuracy than the GMM even when the data were not quantized. We therefore took on the ISCSLP2006 speaker recognition evaluation using EMD-SR. Since the identification tasks defined in the evaluation were open-set, we introduce a new speaker verification module in this paper. Evaluation results showed that EMD-SR achieves a 99.3% Identification Correctness Rate in a closed-channel speaker identification task.
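
To make the matching step described in the abstract more concrete, the following is a minimal sketch of how an EMD between two non-parametric speaker representations might be computed and then used for an open-set decision. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the features, the ground distance, or the design of the verification module. Here each signature is assumed to be a list of (codeword vector, integer occurrence count) pairs, the ground distance is assumed to be Euclidean, and the open-set rejection is a plain distance threshold. The names `emd`, `identify_open_set`, and `threshold` are hypothetical.

```python
import math


def emd(sig_a, sig_b):
    """Earth Mover's Distance between two signatures.

    Each signature is a list of (vector, count) pairs with positive integer
    counts (assumed here: VQ codewords and their occurrence counts).
    Solved as a min-cost flow by successive shortest paths.
    """
    n, m = len(sig_a), len(sig_b)
    src, snk = n + m, n + m + 1
    # Each edge is [target, residual capacity, cost, index of reverse edge].
    graph = [[] for _ in range(n + m + 2)]

    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for i, (vec_a, w_a) in enumerate(sig_a):
        add_edge(src, i, w_a, 0.0)
        for j, (vec_b, _) in enumerate(sig_b):
            # Ground distance: Euclidean (an assumption of this sketch).
            add_edge(i, n + j, w_a, math.dist(vec_a, vec_b))
    for j, (_, w_b) in enumerate(sig_b):
        add_edge(n + j, snk, w_b, 0.0)

    # EMD moves as much mass as the lighter signature holds.
    total_flow = min(sum(w for _, w in sig_a), sum(w for _, w in sig_b))
    flow, work = 0, 0.0
    while flow < total_flow:
        # Bellman-Ford: cheapest augmenting path in the residual graph.
        dist = [math.inf] * len(graph)
        prev = [None] * len(graph)
        dist[src] = 0.0
        for _ in range(len(graph) - 1):
            changed = False
            for u in range(len(graph)):
                if dist[u] == math.inf:
                    continue
                for k, (v, cap, cost, _) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, k)
                        changed = True
            if not changed:
                break
        # Bottleneck capacity along the path.
        push = total_flow - flow
        v = snk
        while v != src:
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        # Apply the flow and update residual capacities.
        v = snk
        while v != src:
            u, k = prev[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
        flow += push
        work += push * dist[snk]
    return work / total_flow


def identify_open_set(test_sig, speaker_models, threshold):
    """Return (speaker or None, best distance).

    The nearest enrolled speaker is accepted only if its EMD is within
    `threshold`; otherwise the utterance is rejected as an impostor.
    This thresholding is a generic stand-in, not the paper's module.
    """
    scores = {name: emd(test_sig, model) for name, model in speaker_models.items()}
    best = min(scores, key=scores.get)
    return (best if scores[best] <= threshold else None), scores[best]


if __name__ == "__main__":
    # Toy 2-D "codebooks"; real features would be higher-dimensional.
    models = {
        "spk1": [((0.0, 0.0), 2)],
        "spk2": [((10.0, 0.0), 2)],
    }
    utterance = [((1.0, 0.0), 2)]
    print(identify_open_set(utterance, models, threshold=2.0))
    print(identify_open_set(utterance, models, threshold=0.5))
```

Integer counts keep the flow arithmetic exact in this sketch; normalized weights could be handled the same way with a tolerance on residual capacities. For realistic codebook sizes, a dedicated linear-programming or optimal-transport solver would be a more practical choice than this pure-Python flow routine.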