Performances evaluation of GMM-UBM and GMM-SVM for speaker recognition in realistic world

Authors:
Nassim Asbai;Abderrahmane Amrouche;Mohamed Debyeche
Affiliations:
Speech Communication and Signal Processing Laboratory, Faculty of Electronics and Computer Sciences, USTHB, Bab Ezzouar, Algiers, Algeria;Speech Communication and Signal Processing Laboratory, Faculty of Electronics and Computer Sciences, USTHB, Bab Ezzouar, Algiers, Algeria;Speech Communication and Signal Processing Laboratory, Faculty of Electronics and Computer Sciences, USTHB, Bab Ezzouar, Algiers, Algeria
Venue:
ICONIP'11 Proceedings of the 18th international conference on Neural Information Processing - Volume Part II
Year:
2011

Abstract

In this paper, an automatic speaker recognition system for realistic environments is presented. Most existing speaker recognition methods, which have been shown to be highly efficient under noise-free conditions, fail drastically in noisy environments. In this work, feature vectors consisting of Mel Frequency Cepstral Coefficients (MFCC) extracted from the speech signal are used to train Support Vector Machines (SVM) and Gaussian mixture models (GMM). To reduce the effect of noisy environments, cepstral mean subtraction (CMS) is applied to the MFCCs. For both the GMM-UBM and GMM-SVM systems, a 2048-mixture UBM is used. The recognition phase was tested with Arabic speakers at different signal-to-noise ratios (SNR) and under three noise conditions taken from the NOISEX-92 database. The experimental results showed that the use of appropriate kernel functions with SVM improved the overall performance of speaker recognition in noisy environments.
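As an illustration of two steps named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the MFCCs are already available as an array of shape (frames, coefficients), that CMS is computed per utterance, and that the noise recording is at least as long as the speech. The function names `cepstral_mean_subtraction` and `mix_at_snr` are hypothetical.

```python
import numpy as np


def cepstral_mean_subtraction(mfcc):
    """Subtract the per-coefficient mean over all frames of one utterance.

    mfcc: array of shape (n_frames, n_coeffs). Assumed layout, not from the paper.
    """
    mfcc = np.asarray(mfcc, dtype=float)
    return mfcc - mfcc.mean(axis=0, keepdims=True)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the result has the requested SNR in dB.

    Assumes len(noise) >= len(speech); the noise is truncated to match.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Scale the noise so that p_speech / (gain^2 * p_noise) = 10^(snr_db / 10).
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + gain * noise
```

Removing the utterance-level mean cancels any constant offset in the cepstral domain, which is how a stationary channel effect appears after the log step of MFCC extraction. This is the usual motivation for CMS, though the paper's exact configuration is not given in the abstract.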