Phoneme recognition using ICA-based feature extraction and transformation

  • Authors:
  • Oh-Wook Kwon; Te-Won Lee

  • Affiliations:
  • School of Electrical and Computer Engineering, Chungbuk National University, 48 Gaesin-dong, Heungdeok-gu, Cheongju, Chungbuk 361-763, South Korea; Institute for Neural Computation, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA

  • Venue:
  • Signal Processing
  • Year:
  • 2004

Abstract

We investigate the use of independent component analysis (ICA) for speech feature extraction in speech recognition systems. Although initial research suggested that learning basis functions by ICA to encode the speech signal efficiently improves recognition accuracy, we observe that this may hold only for recognition tasks with little training data. When compared on a large training database with standard speech recognition features such as mel-frequency cepstral coefficients (MFCCs), the ICA-adapted basis functions perform poorly. This is mainly due to the phase sensitivity of the learned speech basis functions and their lack of time-shift invariance. In contrast to image processing, phase information is not essential for speech recognition. We therefore propose a new scheme in which the phase sensitivity is removed by using an analytical description of the ICA-adapted basis functions obtained via the Hilbert transform. Furthermore, since the basis functions are not shift invariant, we extend the method to include a frequency-based ICA stage that removes redundant time-shift information. The performance of the new feature is evaluated for phoneme recognition on the TIMIT speech database and compared with the standard MFCC feature. The phoneme recognition results show promising accuracy, comparable to that of the well-optimized MFCC features.
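
The phase-sensitivity argument in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation: the basis function here is a plain sinusoid standing in for an ICA-learned one, and the frame length `N`, the cycle count `CYCLES`, and all function names are assumptions chosen for illustration. The sketch builds the analytic signal of the basis function (the basis plus j times its Hilbert transform) and compares the ordinary real-valued coefficient with the magnitude of the complex coefficient as the input frame is phase shifted. The frequency-based ICA stage mentioned in the abstract is not covered.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Return x + j*Hilbert{x} by suppressing the negative-frequency bins."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue              # keep DC and Nyquist bins unchanged
        if k < (n + 1) // 2:
            spectrum[k] *= 2      # double the positive frequencies
        else:
            spectrum[k] = 0       # drop the negative frequencies
    return dft(spectrum, inverse=True)


def real_coefficient(frame, basis):
    """Plain inner product with the real basis function (phase sensitive)."""
    return sum(f * b for f, b in zip(frame, basis))


def magnitude_coefficient(frame, basis):
    """Magnitude of the inner product with the analytic basis function."""
    analytic = analytic_signal(basis)
    return abs(sum(f * z.conjugate() for f, z in zip(frame, analytic)))


# Assumed toy setup: a 64-sample frame and a 4-cycle sinusoidal basis function.
N = 64
CYCLES = 4
basis = [math.cos(2 * math.pi * CYCLES * t / N) for t in range(N)]


def shifted_frame(phase):
    """Input frame with the same frequency as the basis but a shifted phase."""
    return [math.cos(2 * math.pi * CYCLES * t / N + phase) for t in range(N)]


phases = [0.0, math.pi / 4, math.pi / 2]
real_coeffs = [real_coefficient(shifted_frame(p), basis) for p in phases]
mag_coeffs = [magnitude_coefficient(shifted_frame(p), basis) for p in phases]

for p, r, m in zip(phases, real_coeffs, mag_coeffs):
    print(f"phase={p:.3f}  real={r:.3f}  magnitude={m:.3f}")
```

In this toy setting the real coefficient shrinks toward zero as the frame approaches a quarter-cycle shift, while the magnitude of the complex coefficient stays the same, which is the kind of phase insensitivity the abstract attributes to the Hilbert-transform description. A learned, localized basis function would show this only approximately.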