A noise robust feature extraction algorithm using joint wavelet packet subband decomposition and AR modeling of speech signals

  • Authors:
  • Bojan Kotnik; Zdravko Kačič

  • Affiliations:
  • Faculty of Electrical Engineering and Computer Science, University of Maribor, Smetanova ul. 17, SI 2000 Maribor, Slovenia; Faculty of Electrical Engineering and Computer Science, University of Maribor, Smetanova ul. 17, SI 2000 Maribor, Slovenia

  • Venue:
  • Signal Processing
  • Year:
  • 2007

Abstract

This paper presents a noise-robust feature extraction (NRFE) algorithm that combines wavelet packet decomposition (WPD) with autoregressive (AR) modeling of the speech signal. In contrast to a time-frequency representation based on the short-time Fourier transform (STFT), wavelet packet decomposition can better represent the non-stationary parts of the speech signal (e.g. consonants), while vowels are well described by an AR model, as in LPC analysis. The wavelet packet parameters are computed with the proposed Root-Log compression scheme. The separately extracted WPD and AR-based parameters are combined and then transformed using linear discriminant analysis (LDA) to produce a lower-dimensional output feature vector. Noise robustness is improved by applying the proposed wavelet-based denoising algorithm, which uses a modified soft-thresholding procedure and a time-frequency adaptive threshold. The proposed voice activity detector, based on a skewness-to-kurtosis ratio of the LPC residual signal, is used to perform frame dropping. Speech recognition results on the Aurora 2 and Aurora 3 databases show overall performance improvements of 44.7% and 48.2%, respectively, relative to the baseline MFCC front-end.
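
To make the voice activity statistic mentioned in the abstract more concrete, the following is a minimal Python sketch of a skewness-to-kurtosis ratio computed on an LPC residual. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the LPC order, the exact moment definitions, or the decision threshold, so the autocorrelation-method LPC, the order of 10, the non-excess kurtosis, and all function names below are hypothetical choices.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC via the Levinson-Durbin recursion.

    Returns the prediction-error filter a[0..order] with a[0] = 1.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation lags 0..order of the frame.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: stop the recursion early
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def skewness_to_kurtosis_ratio(x):
    """Ratio of sample skewness to (non-excess) sample kurtosis.

    Assumption: biased central-moment estimators; the paper may differ.
    """
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    if m2 == 0.0:
        return 0.0  # constant signal: ratio is undefined, report 0
    skewness = np.mean(centered ** 3) / m2 ** 1.5
    kurtosis = np.mean(centered ** 4) / m2 ** 2
    return skewness / kurtosis


def lpc_residual_ratio(frame, order=10):
    """Inverse-filter a frame with its own LPC model, then take the ratio."""
    frame = np.asarray(frame, dtype=float)
    a = lpc_coefficients(frame, order)
    # Residual e[n] = sum_k a[k] * x[n - k], truncated to the frame length.
    residual = np.convolve(frame, a)[: len(frame)]
    return skewness_to_kurtosis_ratio(residual)
```

In a frame-dropping front-end, a statistic like `lpc_residual_ratio` would be compared against a threshold to decide whether a frame is kept; how that threshold is chosen is not stated in the abstract and is left open here.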