Spectral entropy and spectral shape based pre-quantization for real time speaker identification system

  • Authors:
  • Gourav Sarkar;Goutam Saha

  • Affiliations:
  • Department of Electronics and Electrical Communication Engineering, IIT Kharagpur, India, PIN 721302; Department of Electronics and Electrical Communication Engineering, IIT Kharagpur, India, PIN 721302

  • Venue:
  • International Journal of Speech Technology
  • Year:
  • 2010

Abstract

Pre-processing is one of the vital steps in developing a robust and efficient recognition system. Better pre-processing not only aids in better data selection but also significantly reduces computational complexity. Further, an efficient frame selection technique can improve the overall performance of the system. Pre-quantization (PQ) is the technique of selecting fewer frames in the pre-processing stage to reduce the computational burden in the post-processing stages of speaker identification (SI). In this paper, we develop PQ techniques based on spectral entropy and spectral shape to pick suitable frames containing speaker-specific information that varies from frame to frame depending on spoken text and environmental conditions. The attempt is to exploit the statistical properties of distributions of speech frames at the pre-processing stage of speaker recognition. Our aim is not only to reduce the frame rate but also to keep identification accuracy reasonably high. Further, we have also analyzed the robustness of our proposed techniques on noisy utterances. To establish the efficacy of our proposed methods, we used two different databases, POLYCOST (telephone speech) and YOHO (microphone speech).
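
As a rough illustration of the spectral entropy idea mentioned in the abstract, the sketch below computes the Shannon entropy of each frame's normalized power spectrum and keeps a fixed fraction of frames. It is not the authors' method: the abstract does not state the selection rule, so the function names, the `keep_ratio` parameter, and the choice to keep the lowest-entropy frames (those with the most peaked spectra) are all assumptions made for illustration. The spectral shape criterion is not covered here.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum (bins 0..N/2) of a real-valued frame."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy (in bits) of the frame's normalized power spectrum."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        # Assumption: treat a silent frame as maximally flat so it ranks last.
        return math.log2(len(power))
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def select_frames(frames, keep_ratio=0.5):
    """Return sorted indices of the lowest-entropy frames.

    Assumption: low spectral entropy marks frames worth keeping.
    The paper's actual criterion may differ.
    """
    entropies = [spectral_entropy(f) for f in frames]
    n_keep = max(1, round(keep_ratio * len(frames)))
    ranked = sorted(range(len(frames)), key=lambda i: entropies[i])
    return sorted(ranked[:n_keep])


# Toy frames of length 32: two pure tones, one impulse (flat spectrum), one silent.
N = 32
tone_a = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
tone_b = [math.sin(2 * math.pi * 6 * t / N) for t in range(N)]
impulse = [1.0] + [0.0] * (N - 1)
silence = [0.0] * N

kept = select_frames([tone_a, impulse, tone_b, silence], keep_ratio=0.5)
```

In this toy setup the two tonal frames have near-zero entropy and are retained, while the flat-spectrum and silent frames are dropped. A real system would frame and window actual speech, use an FFT routine instead of the naive DFT, and tune the selection rule on held-out data.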