Feature and signal enhancement for robust speaker identification of G.729 decoded speech

  • Authors:
  • Kalpesh Raval;Ravi P. Ramachandran;Sachin S. Shetty;Brett Y. Smolenski

  • Affiliations:
  • Rowan University, Glassboro, NJ;Rowan University, Glassboro, NJ;Tennessee State University, Nashville, TN;Assured Information Security, Rome, NY

  • Venue:
  • ICONIP'12 Proceedings of the 19th international conference on Neural Information Processing - Volume Part V
  • Year:
  • 2012

Abstract

For wireless remote access security, there is an emerging need for biometric speaker identification (SID) systems that are robust to speech coding distortion. This paper presents results on a Gaussian mixture model (GMM) based SID system that is trained on clean speech and tested on speech decoded by the G.729 codec. To mitigate the performance loss caused by mismatched training and testing conditions, five robust features, two enhancement approaches and three fusion strategies are used. The first enhancement method is feature compensation based on the affine transform. The second is the McCree signal enhancement approach, which uses the spectral envelope information in the G.729 bit stream. Ensemble systems using decision-level fusion, score fusion and Borda count are studied. The best performance is obtained by combining signal enhancement, feature compensation and decision-level fusion, which yields an identification success rate (ISR) of 89.8%.
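
The three fusion strategies named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact rules used in the paper, so everything below is an assumption: the function names are hypothetical, decision-level fusion is taken to be a majority vote with ties broken toward the lowest speaker index, score fusion is taken to be a plain sum of per-system scores (which presumes the scores are on comparable scales), and the Borda count awards one point per speaker outranked. Here `scores[k][s]` stands for the score (for example a GMM log-likelihood) that system `k` assigns to speaker `s`.

```python
from collections import Counter


def decision_level_fusion(scores):
    """Majority vote over each system's top-scoring speaker (assumed rule)."""
    # Each system votes for the speaker index with its highest score.
    votes = [max(range(len(row)), key=row.__getitem__) for row in scores]
    counts = Counter(votes)
    best = max(counts.values())
    # Assumption: ties go to the lowest speaker index.
    return min(s for s, c in counts.items() if c == best)


def score_fusion(scores):
    """Sum the scores across systems, then pick the best speaker (assumed rule)."""
    n = len(scores[0])
    totals = [sum(row[s] for row in scores) for s in range(n)]
    return max(range(n), key=totals.__getitem__)


def borda_count(scores):
    """Each system ranks the speakers; a speaker earns one point per speaker ranked below it."""
    n = len(scores[0])
    points = [0] * n
    for row in scores:
        # Sort speaker indices from worst to best score for this system.
        ranked = sorted(range(n), key=row.__getitem__)
        for pts, s in enumerate(ranked):
            points[s] += pts
    return max(range(n), key=points.__getitem__)


# Illustrative scores only (not data from the paper): 3 systems, 3 speakers.
example = [
    [1.0, 0.9, 0.0],
    [1.0, 0.9, 0.0],
    [0.0, 5.0, 4.0],
]
print(decision_level_fusion(example), score_fusion(example), borda_count(example))
```

With these made-up scores, the majority vote selects speaker 0 because two of the three systems rank it first, while summing the scores selects speaker 1 because one system prefers it by a large margin. This shows why the choice of fusion strategy can change the identification result.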