A comparison of session variability compensation approaches for speaker verification

  • Authors:
  • Mitchell McLaren;Robert Vogt;Brendan Baker;Sridha Sridharan

  • Affiliations:
  • Speech and Audio Research Laboratory, Queensland University of Technology, Brisbane, Australia;Speech and Audio Research Laboratory, Queensland University of Technology, Brisbane, Australia;Speech and Audio Research Laboratory, Queensland University of Technology, Brisbane, Australia;Speech and Audio Research Laboratory, Queensland University of Technology, Brisbane, Australia

  • Venue:
  • IEEE Transactions on Information Forensics and Security
  • Year:
  • 2010

Abstract

This paper compares two of the leading techniques for session variability compensation in the context of support vector machine (SVM) speaker verification using Gaussian mixture model (GMM) mean supervectors: joint factor analysis (JFA) modeling and nuisance attribute projection (NAP). The comparison is motivated by the distinctly different domains in which these techniques operate: the probabilistic GMM domain for JFA versus the discriminative SVM kernel for NAP. A theoretical analysis is given comparing the JFA and NAP approaches to variability compensation. The role of speaker factors in the factor analysis model is also contrasted with the scatter difference NAP objective of retaining speaker information in the SVM kernel space. These methods for retaining speaker variation are found to provide improved verification performance over the removal of channel effects alone. Overall, experimental results on the NIST 2006 and 2008 SRE corpora demonstrate the effectiveness of both JFA and NAP techniques for reducing the effects of session variability. However, the overheads associated with implementing JFA may make NAP the more attractive technique, given its simple yet effective approach to variability compensation.
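
For orientation, the two compensation approaches named in the abstract are commonly written as follows. This is a sketch in standard textbook notation, not notation taken from the abstract: the symbols $\mathbf{m}$, $\mathbf{V}$, $\mathbf{D}$, $\mathbf{U}$, $\mathbf{y}_s$, $\mathbf{z}_s$, $\mathbf{x}_{s,h}$, $\mathbf{P}$, $\mathbf{S}_w$, $\mathbf{S}_b$ and the weight $\alpha$ are assumptions, and the paper's own formulation may differ in detail.

```latex
% JFA (assumed standard form): the GMM mean supervector of speaker s in session h
% is decomposed into a speaker part and a session part.
%   m        : speaker-independent (UBM) mean supervector
%   V y_s    : low-rank speaker subspace with speaker factors y_s
%   D z_s    : diagonal residual speaker term
%   U x_{s,h}: low-rank session subspace with channel factors x_{s,h}
\[
  \mathbf{M}_{s,h} \;=\; \mathbf{m} + \mathbf{V}\mathbf{y}_s + \mathbf{D}\mathbf{z}_s
                   + \mathbf{U}\mathbf{x}_{s,h}
\]

% NAP (assumed standard form): session directions are projected out of the
% SVM kernel. U has orthonormal columns spanning the nuisance subspace, so P
% is symmetric and idempotent.
\[
  \mathbf{P} \;=\; \mathbf{I} - \mathbf{U}\mathbf{U}^{\top},
  \qquad \mathbf{U}^{\top}\mathbf{U} = \mathbf{I},
  \qquad
  K(\mathbf{a},\mathbf{b}) \;=\; (\mathbf{P}\mathbf{a})^{\top}(\mathbf{P}\mathbf{b})
                           \;=\; \mathbf{a}^{\top}\mathbf{P}\,\mathbf{b}
\]

% Scatter difference NAP (one common formulation, stated here as an assumption):
% the columns of U are the leading eigenvectors of the within-speaker scatter
% minus a weighted between-speaker scatter, so that directions carrying speaker
% information are less likely to be removed.
\[
  \mathbf{U} \;=\; \text{leading eigenvectors of } \;
  \mathbf{S}_w - \alpha\,\mathbf{S}_b, \qquad \alpha \ge 0
\]
```

With $\alpha = 0$ the last expression reduces to conventional NAP, which removes the dominant within-speaker (session) directions without regard to speaker information; this corresponds to the "removal of channel effects alone" that the abstract uses as its point of comparison.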