Robust regression fusion of GMM-UBM and GMM-SVM normalized scores using G729 bit-stream for speaker recognition over IP

  • Authors:
  • Dalila Yessad;Abderrahmane Amrouche

  • Affiliations:
  • LCPTS, Speech Communication and Signal Processing Lab., Faculty of Electronics and Computer Sciences, USTHB, Bab Ezzouar, Alger 16111;LCPTS, Speech Communication and Signal Processing Lab., Faculty of Electronics and Computer Sciences, USTHB, Bab Ezzouar, Alger 16111

  • Venue:
  • International Journal of Speech Technology
  • Year:
  • 2014

Abstract

A novel approach based on robust regression fusion of normalized scores, named Normalized Scores following Robust Regression Fusion (NSRRF), is proposed to enhance speaker recognition over IP networks. It can be used in both Network Speaker Recognition (NSR) and Distributed Speaker Recognition (DSR) systems. In this framework, the speech is assumed to be encoded by the G729 coder on the client side and then transmitted to the server side, where the ASR systems are located. Speaker recognition relies on the Gaussian Mixture Model with Universal Background Model (GMM-UBM) and the GMM supervector with Support Vector Machine (GMM-SVM), both with normalized scores. The feature vectors consist of Mel Frequency Cepstral Coefficients (MFCC) and Linear Prediction Cepstral Coefficients (LPCC), both derived from the Line Spectral Pairs (LSP) extracted from the G729 bit-stream over IP. Experiments conducted on the ARADIGITS database with the LIA SpkDet system, which is based on the ALIZE platform, show first that using features extracted directly from the G729 bit-stream significantly reduces the error rate and outperforms the baseline ASR-over-IP system, which operates on the resynthesized (reconstructed) speech obtained from the G729 decoder. In addition, the results show that the proposed approach, in which score normalization is followed by robust regression fusion, achieves the best result and outperforms conventional ASR over IP networks.
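The fusion idea described in the abstract (normalize the scores of two systems, then combine them with a robust regression) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which normalization or which robust estimator is used, so Z-normalization against impostor scores and a Huber estimator fitted by iteratively reweighted least squares are assumed here. All function names and the synthetic scores are hypothetical.

```python
import numpy as np

def znorm(scores, impostor_scores):
    """Z-normalize scores with impostor statistics (assumed normalization)."""
    mu = np.mean(impostor_scores)
    sigma = np.std(impostor_scores)
    return (np.asarray(scores, dtype=float) - mu) / sigma

def huber_irls(X, y, delta=1.345, n_iter=50, tol=1e-8):
    """Fit y ~ X @ w with Huber weights (assumed robust estimator) via IRLS."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary least-squares start
    for _ in range(n_iter):
        r = y - X @ w
        # Robust residual scale from the median absolute deviation
        scale = max(np.median(np.abs(r - np.median(r))) / 0.6745, 1e-12)
        u = np.abs(r) / scale
        # Huber weights: 1 for small residuals, delta/u for large ones
        weights = np.where(u <= delta, 1.0, delta / np.maximum(u, 1e-12))
        sw = np.sqrt(weights)
        w_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.linalg.norm(w_new - w) < tol
        w = w_new
        if converged:
            break
    return w

def design(s_ubm, s_svm):
    """Stack an intercept column with the two normalized score streams."""
    return np.column_stack([np.ones(len(s_ubm)), s_ubm, s_svm])

# Synthetic stand-ins for GMM-UBM and GMM-SVM trial scores (hypothetical data)
rng = np.random.default_rng(0)
n = 200
labels = np.concatenate([np.ones(n), np.zeros(n)])  # 1 = target, 0 = impostor
raw_ubm = np.concatenate([rng.normal(2.0, 1.0, n), rng.normal(0.0, 1.0, n)])
raw_svm = np.concatenate([rng.normal(1.5, 1.0, n), rng.normal(-0.5, 1.0, n)])

s_ubm = znorm(raw_ubm, raw_ubm[labels == 0])
s_svm = znorm(raw_svm, raw_svm[labels == 0])

w = huber_irls(design(s_ubm, s_svm), labels)  # fusion weights
fused = design(s_ubm, s_svm) @ w              # fused score per trial
```

In this sketch the fused score is a weighted sum of the two normalized scores plus an intercept, and a decision threshold would then be applied to `fused`. In practice the weights would be fitted on a development set and applied to held-out trials.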