Performances evaluation of GMM-UBM and GMM-SVM for speaker recognition in realistic world

  • Authors:
  • Nassim Asbai;Abderrahmane Amrouche;Mohamed Debyeche

  • Affiliations:
  • Speech Communication and Signal Processing Laboratory, Faculty of Electronics and Computer Sciences, USTHB, Bab Ezzouar, Algiers, Algeria;Speech Communication and Signal Processing Laboratory, Faculty of Electronics and Computer Sciences, USTHB, Bab Ezzouar, Algiers, Algeria;Speech Communication and Signal Processing Laboratory, Faculty of Electronics and Computer Sciences, USTHB, Bab Ezzouar, Algiers, Algeria

  • Venue:
  • ICONIP'11 Proceedings of the 18th international conference on Neural Information Processing - Volume Part II
  • Year:
  • 2011

Abstract

In this paper, an automatic speaker recognition system for realistic environments is presented. Most existing speaker recognition methods, which have been shown to be highly efficient under noise-free conditions, fail drastically in noisy environments. In this work, feature vectors consisting of the Mel Frequency Cepstral Coefficients (MFCC) extracted from the speech signal are used to train Support Vector Machines (SVM) and Gaussian mixture models (GMM). To reduce the effect of noisy environments, cepstral mean subtraction (CMS) is applied to the MFCC. For both the GMM-UBM and GMM-SVM systems, a 2048-mixture universal background model (UBM) is used. The recognition phase was tested with Arabic speakers at different signal-to-noise ratios (SNR) and under three noise conditions taken from the NOISEX-92 database. The experimental results showed that the use of appropriate kernel functions with SVM improved the overall performance of speaker recognition in noisy environments.
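
As an illustration of the cepstral mean subtraction step mentioned in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function name `cepstral_mean_subtraction` and the toy coefficient values are assumptions made for this example, and the abstract does not say whether the mean is taken per utterance or over a sliding window (per utterance is assumed here). The idea is that a stationary channel adds a roughly constant offset to each cepstral coefficient, so subtracting the per-coefficient mean over the frames removes that offset.

```python
from typing import List


def cepstral_mean_subtraction(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-coefficient mean, computed over all frames of one utterance.

    `frames` is a list of feature vectors (one per frame), all of equal length.
    This helper is hypothetical and only illustrates the general CMS idea.
    """
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each cepstral coefficient across time.
    means = [sum(frame[k] for frame in frames) / n_frames for k in range(n_coeffs)]
    # Remove the mean from every frame.
    return [[frame[k] - means[k] for k in range(n_coeffs)] for frame in frames]


# Toy "MFCC" matrix: 3 frames x 2 coefficients (made-up values, not real data).
mfcc = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = cepstral_mean_subtraction(mfcc)

# The same features with a constant per-coefficient offset, standing in for a
# stationary channel effect (an assumption of this sketch).
shifted = [[c0 + 0.5, c1 - 3.0] for c0, c1 in mfcc]
normalized_shifted = cepstral_mean_subtraction(shifted)
```

In this toy case `normalized` and `normalized_shifted` coincide, which shows why CMS helps against constant channel offsets; it does not by itself remove additive noise that varies over time.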