i-Vector with sparse representation classification for speaker verification

  • Authors:
  • Jia Min Karen Kua;Julien Epps;Eliathamby Ambikairajah

  • Affiliations:
  • School of Electrical Engineering and Telecommunications, The University of New South Wales, UNSW Sydney, NSW 2052, Australia;School of Electrical Engineering and Telecommunications, The University of New South Wales, UNSW Sydney, NSW 2052, Australia and ATP Research Laboratory, National ICT Australia (NICTA), Eveleigh 2 ...;School of Electrical Engineering and Telecommunications, The University of New South Wales, UNSW Sydney, NSW 2052, Australia and ATP Research Laboratory, National ICT Australia (NICTA), Eveleigh 2 ...

  • Venue:
  • Speech Communication
  • Year:
  • 2013

Abstract

Sparse representation-based methods have recently shown promise for speaker recognition systems. This paper investigates and develops i-vector based sparse representation classification (SRC) as an alternative to the support vector machine (SVM) and Cosine Distance Scoring (CDS) classifiers, producing an approach we term i-vector-sparse representation classification (i-SRC). Unlike SVM, which fixes the support vectors for each target example, SRC allows the supports, which we term sparse coefficient vectors, to be adapted to the test signal being characterized. Furthermore, like CDS, SRC does not require a training phase. We also analyze different sparseness methods and dictionary compositions to determine the best configuration for speaker recognition. We observe that including an identity matrix in the dictionary helps to remove sensitivity to outliers and that sparseness methods based on the ℓ1 and ℓ2 norms offer the best performance. Combining both techniques achieves an 18% relative reduction in equal error rate (EER) over an SRC system based on the ℓ1 norm without an identity matrix. Experimental results on NIST 2010 SRE show that i-SRC consistently outperforms i-SVM and i-CDS in EER by 0.14-0.81%, and the fusion of i-CDS and i-SRC achieves a relative EER reduction of 8-19% over i-SRC alone.
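The idea of sparse representation classification over a dictionary extended with an identity matrix can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the solver, the regularization weights, or the scoring rule. The sketch assumes one possible reading of the ℓ1 and ℓ2 sparseness, namely an elastic-net style objective solved by iterative soft thresholding. The function names (`unit_columns`, `sparse_code`, `src_score`), the parameters `lam1` and `lam2`, the score definition, and the synthetic data standing in for i-vectors are all hypothetical.

```python
import numpy as np

def unit_columns(M):
    """Scale every column of M to unit l2 norm."""
    return M / np.linalg.norm(M, axis=0, keepdims=True)

def sparse_code(D, y, lam1=0.05, lam2=0.01, n_iter=500):
    """Approximately minimise
        0.5*||y - D x||^2 + lam1*||x||_1 + 0.5*lam2*||x||^2
    with iterative soft thresholding (ISTA). Assumed objective, not the paper's."""
    # Step size is 1 / Lipschitz constant of the smooth part of the objective.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2 + lam2)
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y) + lam2 * x
        z = x - step * grad
        # Soft thresholding handles the l1 penalty.
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam1, 0.0)
    return x

def src_score(target, background, test, **solver_args):
    """Hypothetical verification score: the share of l1 coefficient mass
    that falls on the target speaker's columns of the dictionary."""
    A = unit_columns(np.column_stack([target, background]))
    # Appending an identity matrix lets outlier components of the test
    # vector be absorbed by dedicated atoms instead of speaker atoms.
    D = np.hstack([A, np.eye(A.shape[0])])
    y = test / np.linalg.norm(test)
    x = sparse_code(D, y, **solver_args)
    coef = np.abs(x[:A.shape[1]])  # ignore the identity-atom coefficients
    total = coef.sum()
    n_target = target.shape[1]
    return float(coef[:n_target].sum() / total) if total > 0 else 0.0

# Synthetic stand-ins for i-vectors (real i-vectors come from a trained extractor).
rng = np.random.default_rng(0)
dim = 20
speaker = rng.normal(size=dim)
target = speaker[:, None] + 0.1 * rng.normal(size=(dim, 3))   # 3 target sessions
background = rng.normal(size=(dim, 30))                       # 30 other speakers
genuine_test = speaker + 0.1 * rng.normal(size=dim)
impostor_test = background[:, 0] + 0.1 * rng.normal(size=dim)

genuine = src_score(target, background, genuine_test)
impostor = src_score(target, background, impostor_test)
print(f"genuine trial score:  {genuine:.3f}")
print(f"impostor trial score: {impostor:.3f}")
```

No model is fitted ahead of time here: the dictionary is simply assembled from the available vectors and the sparse coefficients are recomputed for each test vector, which is the sense in which SRC, like CDS, needs no training phase.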