Front-End Factor Analysis for Speaker Verification

Authors:
N. Dehak;P. J. Kenny;R. Dehak;P. Dumouchel;P. Ouellet
Affiliations:
Comput. Sci. & Artificial Intell. Lab., Massachusetts Inst. of Technol., Cambridge, MA, USA;-;-;-;-
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2011

Citing 0
Cited 18

Automatic emotion recognition from speech: a PhD research proposal

ACII'11 Proceedings of the 4th international conference on Affective computing and intelligent interaction - Volume Part II
Applying emotional factor analysis and I-vector to emotional speaker recognition

CCBR'11 Proceedings of the 6th Chinese conference on Biometric recognition
Comparative evaluation of feature normalization techniques for speaker verification

NOLISP'11 Proceedings of the 5th international conference on Advances in nonlinear speech processing
Automatic speaker age and gender recognition using acoustic and prosodic level information fusion

Computer Speech and Language
Universal attribute characterization of spoken languages for automatic spoken language recognition

Computer Speech and Language
Is masking a relevant aspect lacking in MFCC? A speaker verification perspective

Pattern Recognition Letters
Multitaper MFCC and PLP features for speaker verification using i-vectors

Speech Communication
i-Vector with sparse representation classification for speaker verification

Speech Communication
Speaker verification in score-ageing-quality classification space

Computer Speech and Language
Comparison between supervised and unsupervised learning of probabilistic linear discriminant analysis mixture models for speaker verification

Pattern Recognition Letters
Compact bag-of-words visual representation for effective linear classification

Proceedings of the 21st ACM international conference on Multimedia
Speaker-adaptive speech recognition using speaker diarization for improved transcription of large spoken archives

Speech Communication
I-vector based speaker recognition using advanced channel compensation techniques

Computer Speech and Language
A study of voice activity detection techniques for NIST speaker recognition evaluations

Computer Speech and Language
A Study of the Cosine Distance-Based Mean Shift for Telephone Speech Diarization

IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
Factorized Sub-Space Estimation for Fast and Memory Effective I-vector Extraction

IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
Maximum Likelihood Acoustic Factor Analysis Models for Robust Speaker Verification in Noise

IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
Improving short utterance i-vector speaker verification using utterance variance modelling and compensation techniques

Speech Communication

Abstract

This paper presents an extension of our previous work, which proposed a new speaker representation for speaker verification. In this modeling, a new low-dimensional speaker- and channel-dependent space is defined using a simple factor analysis. This space is named the total variability space because it models both speaker and channel variabilities. Two speaker verification systems are proposed which use this new representation. The first system is a support vector machine-based system that uses the cosine kernel to estimate the similarity between the input data. The second system directly uses the cosine similarity as the final decision score. We tested three channel compensation techniques in the total variability space: within-class covariance normalization (WCCN), linear discriminant analysis (LDA), and nuisance attribute projection (NAP). We found that the best results are obtained when LDA is followed by WCCN. We achieved an equal error rate (EER) of 1.12% and a MinDCF of 0.0094 using cosine distance scoring on the male English trials of the core condition of the NIST 2008 Speaker Recognition Evaluation dataset. We also obtained a 4% absolute EER improvement for both-gender trials on the 10 s-10 s condition compared to the classical joint factor analysis scoring.
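
The cosine scoring with WCCN mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that total-variability vectors (i-vectors) have already been extracted, omits the LDA step, and uses hypothetical names (`wccn_projection`, `cosine_score`), random toy data, and a small ridge term added only to keep the toy covariance invertible.

```python
import numpy as np


def wccn_projection(ivectors, speaker_ids, ridge=1e-6):
    """Return B such that B @ B.T equals the inverse within-class covariance.

    ivectors: (n_utterances, dim) array; speaker_ids: (n_utterances,) labels.
    The ridge term is an assumption of this sketch, not part of the abstract.
    """
    ivectors = np.asarray(ivectors, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    dim = ivectors.shape[1]
    speakers = np.unique(speaker_ids)

    # Average the per-speaker covariance matrices.
    W = np.zeros((dim, dim))
    for s in speakers:
        x = ivectors[speaker_ids == s]
        centered = x - x.mean(axis=0)
        W += centered.T @ centered / len(x)
    W /= len(speakers)
    W += ridge * np.eye(dim)

    # Cholesky factor of inv(W): B is lower triangular with B @ B.T = inv(W).
    return np.linalg.cholesky(np.linalg.inv(W))


def cosine_score(w_target, w_test, B=None):
    """Cosine similarity between two vectors, optionally after WCCN."""
    w_target = np.asarray(w_target, dtype=float)
    w_test = np.asarray(w_test, dtype=float)
    if B is not None:
        w_target = B.T @ w_target
        w_test = B.T @ w_test
    denom = np.linalg.norm(w_target) * np.linalg.norm(w_test)
    return float(w_target @ w_test / denom)


if __name__ == "__main__":
    # Toy data: 8 hypothetical speakers, 5 utterances each, 4 dimensions.
    rng = np.random.default_rng(0)
    train = rng.normal(size=(40, 4))
    labels = np.repeat(np.arange(8), 5)
    B = wccn_projection(train, labels)

    enrol, test = rng.normal(size=4), rng.normal(size=4)
    score = cosine_score(enrol, test, B)
    # The trial is accepted when the score exceeds a tuned threshold.
    print(f"cosine score: {score:.3f}")
```

In this kind of setup the score is compared directly against a decision threshold, so no separate target-speaker model needs to be trained at verification time.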