Utterance partitioning with acoustic vector resampling for GMM-SVM speaker verification

  • Authors:
  • Man-Wai Mak;Wei Rao

  • Affiliations:
  • Center for Signal Processing, Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hung Hom, Hong Kong;Center for Signal Processing, Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hung Hom, Hong Kong

  • Venue:
  • Speech Communication
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Recent research has demonstrated the merit of combining Gaussian mixture models and support vector machine (SVM) for text-independent speaker verification. However, one unaddressed issue in this GMM-SVM approach is the imbalance between the numbers of speaker-class utterances and impostor-class utterances available for training a speaker-dependent SVM. This paper proposes a resampling technique - namely utterance partitioning with acoustic vector resampling (UP-AVR) - to mitigate the data imbalance problem. Briefly, the sequence order of acoustic vectors in an enrollment utterance is first randomized, which is followed by partitioning the randomized sequence into a number of segments. Each of these segments is then used to produce a GMM supervector via MAP adaptation and mean vector concatenation. The randomization and partitioning processes are repeated several times to produce a sufficient number of speaker-class supervectors for training an SVM. Experimental evaluations based on the NIST 2002 and 2004 SRE suggest that UP-AVR can reduce the error rate of GMM-SVM systems.