A Joint Factor Analysis Approach to Progressive Model Adaptation in Text-Independent Speaker Verification

Authors:
Shou-Chun Yin;R. Rose;P. Kenny
Affiliations:
McGill Univ., Montreal;-;-
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2007

Abstract

This paper addresses speaker variability and session variability in text-independent speaker verification based on Gaussian mixture models (GMMs). It proposes a speaker model adaptation procedure built on a joint factor analysis approach to speaker verification. This approach is shown to facilitate a progressive unsupervised adaptation strategy that produces an improved model of speaker identity while minimizing the influence of channel variability. The paper also examines the interaction between this model adaptation approach and score normalization strategies, which act to reduce the variation in likelihood ratio scores. This interaction is particularly important for setting decision thresholds in practical speaker verification systems, since the variability of likelihood ratio scores can increase as a result of progressive model adaptation. The adaptation methods were evaluated using the adaptation paradigm defined in the NIST 2005 Speaker Recognition Evaluation Plan, which is based on conversation sides derived from telephone speech utterances. When target speaker models were trained from a single conversation, an equal error rate (EER) of 4.5% was obtained under the NIST unsupervised speaker adaptation scenario.
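
To make the abstract's two evaluation notions concrete, the following is a minimal Python sketch of score normalization and of the equal error rate. It is an illustration under stated assumptions, not the authors' implementation: zero normalization (z-norm) is used here only as one common score normalization, the function names `z_norm` and `equal_error_rate` are hypothetical, and the scores are made-up toy values rather than NIST data.

```python
from statistics import mean, pstdev


def z_norm(raw_score, impostor_scores):
    """Standardize a raw score with impostor statistics (z-norm, assumed here).

    The impostor scores are those obtained by scoring impostor utterances
    against the same target model.
    """
    mu = mean(impostor_scores)
    sigma = pstdev(impostor_scores)  # population standard deviation
    return (raw_score - mu) / sigma


def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER by sweeping the threshold over all observed scores.

    A trial is accepted when its score is at or above the threshold. The EER
    is taken at the threshold where the false acceptance and false rejection
    rates are closest, as the average of the two rates.
    """
    best_gap, eer = None, None
    for t in sorted(set(target_scores) | set(impostor_scores)):
        # False rejection: target trials scoring below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: impostor trials scoring at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


# Toy scores (hypothetical values, for illustration only).
targets = [2.0, 3.0, 4.0, 5.0]
impostors = [0.0, 1.0, 2.5, 1.5]
print(equal_error_rate(targets, impostors))  # 0.25
```

Because a single decision threshold is shared across all target models, any spread in the per-model score distributions moves the operating point, which is why normalization matters when adaptation changes the score statistics.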