Filterbank optimization for robust ASR using GA and PSO

  • Authors:
  • R. K. Aggarwal;M. Dave

  • Affiliations:
  • Department of Computer Engineering, N.I.T., Kurukshetra, India;Department of Computer Engineering, N.I.T., Kurukshetra, India

  • Venue:
  • International Journal of Speech Technology
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Automatic speech recognition (ASR) systems follow a well established approach of pattern recognition, that is signal processing based feature extraction at front-end and likelihood evaluation of feature vectors at back-end. Mel-frequency cepstral coefficients (MFCCs) are the features widely used in state-of-the-art ASR systems, which are derived by logarithmic spectral energies of the speech signal using Mel-scale filterbank. In filterbank analysis of MFCC there is no consensus for the spacing and number of filters used in various noise conditions and applications. In this paper, we propose a novel approach to use particle swarm optimization (PSO) and genetic algorithm (GA) to optimize the parameters of MFCC filterbank such as the central and side frequencies. The experimental results show that the new front-end outperforms the conventional MFCC technique. All the investigations are conducted using two separate classifiers, HMM and MLP, for Hindi vowels recognition in typical field condition as well as in noisy environment.