Utterance partitioning with acoustic vector resampling for GMM-SVM speaker verification

Authors:
Man-Wai Mak; Wei Rao
Affiliations:
Center for Signal Processing, Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hung Hom, Hong Kong (both authors)
Venue:
Speech Communication
Year:
2011



Abstract

Recent research has demonstrated the merit of combining Gaussian mixture models (GMMs) and support vector machines (SVMs) for text-independent speaker verification. However, one unaddressed issue in this GMM-SVM approach is the imbalance between the numbers of speaker-class and impostor-class utterances available for training a speaker-dependent SVM. This paper proposes a resampling technique, utterance partitioning with acoustic vector resampling (UP-AVR), to mitigate this data imbalance problem. Briefly, the sequence order of the acoustic vectors in an enrollment utterance is first randomized, and the randomized sequence is then partitioned into a number of segments. Each segment is used to produce a GMM supervector via MAP adaptation and mean vector concatenation. The randomization and partitioning steps are repeated several times to produce a sufficient number of speaker-class supervectors for training an SVM. Experimental evaluations on the NIST 2002 and 2004 SRE suggest that UP-AVR can reduce the error rate of GMM-SVM systems.
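
The resampling procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes a diagonal-covariance universal background model whose weights, means, and variances are already available, mean-only MAP adaptation, and a relevance factor of 16. The function names (`map_adapt_means`, `up_avr`) and all parameter names and defaults are hypothetical, and any supervector normalization used before SVM training is left out.

```python
import numpy as np


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a diagonal-covariance GMM (assumed setup).

    frames: (T, D) acoustic vectors; weights: (C,); means, variances: (C, D).
    Returns the adapted means with shape (C, D).
    """
    diff = frames[:, None, :] - means[None, :, :]               # (T, C, D)
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )                                                           # (T, C)
    log_post = np.log(weights) + log_gauss
    log_post -= log_post.max(axis=1, keepdims=True)             # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)                     # component posteriors

    n = post.sum(axis=0)                                        # soft counts, (C,)
    mean_stat = (post.T @ frames) / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + relevance))[:, None]                      # adaptation coefficients
    return alpha * mean_stat + (1.0 - alpha) * means


def up_avr(frames, weights, means, variances,
           n_partitions=4, n_resamplings=3, seed=0):
    """Shuffle the frame order, split into segments, and build one
    supervector per segment; repeat n_resamplings times.

    Returns an array of shape (n_resamplings * n_partitions, C * D).
    """
    rng = np.random.default_rng(seed)
    supervectors = []
    for _ in range(n_resamplings):
        shuffled = frames[rng.permutation(len(frames))]         # randomize frame order
        for segment in np.array_split(shuffled, n_partitions):  # partition into segments
            adapted = map_adapt_means(segment, weights, means, variances)
            supervectors.append(adapted.ravel())                # concatenate mean vectors
    return np.stack(supervectors)
```

With 4 partitions and 3 resamplings, a single enrollment utterance yields 12 speaker-class supervectors in this sketch. Each one could then be given to an SVM trainer as a positive example alongside the impostor-class supervectors.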