Comparative evaluation of feature normalization techniques for speaker verification

Authors:
Md Jahangir Alam;Pierre Ouellet;Patrick Kenny;Douglas O'Shaughnessy
Affiliations:
CRIM, Montreal, Canada and INRS-EMT, University of Quebec, Montreal, Canada;CRIM, Montreal, Canada;CRIM, Montreal, Canada;INRS-EMT, University of Quebec, Montreal, Canada
Venue:
NOLISP'11 Proceedings of the 5th international conference on Advances in nonlinear speech processing
Year:
2011

Abstract

This paper investigates several feature normalization techniques for use in an i-vector speaker verification system based on a mixture probabilistic linear discriminant analysis (PLDA) model. The objective of feature normalization is to compensate for the effects of environmental mismatch. We study three techniques: short-time Gaussianization (STG), short-time mean and variance normalization (STMVN), and short-time mean and scale normalization (STMSN). Our goal is to compare their performance on the telephone speech (det5) and microphone speech (det1, det2, det3, and det4) conditions of the NIST SRE 2010 corpus. Experimental results show that the performance of the STMVN and STMSN techniques is comparable to that of the STG technique.
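
As a rough illustration of one of the techniques named in the abstract, the following is a minimal sketch of short-time mean and variance normalization, assuming the common formulation in which each feature coefficient is normalized by the mean and standard deviation computed over a sliding window centered on the current frame. The abstract does not give the formula or any settings, so the function name `stmvn`, the default window length of 151 frames, and the `eps` stabilizer are all hypothetical choices made here, not the authors' configuration.

```python
import numpy as np


def stmvn(features, window=151, eps=1e-8):
    """Sketch of short-time mean and variance normalization.

    features: array of shape (num_frames, num_coeffs), e.g. cepstral features.
    window:   sliding window length in frames (assumed value, not from the paper).
    eps:      small constant to avoid division by zero (assumption).
    """
    feats = np.asarray(features, dtype=float)
    num_frames = feats.shape[0]
    half = window // 2
    out = np.empty_like(feats)
    for t in range(num_frames):
        # Window centered on frame t, truncated at the utterance boundaries.
        lo = max(0, t - half)
        hi = min(num_frames, t + half + 1)
        segment = feats[lo:hi]
        # Normalize each coefficient by its local mean and standard deviation.
        out[t] = (feats[t] - segment.mean(axis=0)) / (segment.std(axis=0) + eps)
    return out
```

When the window is longer than twice the utterance, every frame sees the whole utterance and the sketch reduces to ordinary per-utterance mean and variance normalization; shorter windows track slowly varying channel or environment effects more locally.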