Using Laplacian eigenmaps latent variable model and manifold learning to improve speech recognition accuracy

  • Authors:
  • Ayyoob Jafari; Farshad Almasganj

  • Affiliations:
  • Biomedical Engineering Department, Amirkabir University of Technology, Tehran, Iran

  • Venue:
  • Speech Communication
  • Year:
  • 2010

Abstract

This paper demonstrates the application of the Laplacian eigenmaps latent variable model (LELVM) to speech recognition. LELVM is a recent dimensionality reduction method that combines the benefits of latent variable models (a multimodal probability density over latent and observed variables, and globally defined, differentiable nonlinear mappings for reconstruction and dimensionality reduction) with those of spectral manifold learning methods (no local optima, the ability to unfold nonlinear manifolds, and good practical scaling to high-dimensional latent spaces). LELVM is obtained by defining an out-of-sample mapping for Laplacian eigenmaps via a semi-supervised learning procedure; the resulting model is simple, non-parametric and computationally inexpensive. In this work, LELVM projects MFCC features into a new subspace that yields greater discrimination among phonetic categories. To evaluate the proposed feature-modification scheme, an HMM-based speech recognition system and the TIMIT speech database are employed. The experiments show an accuracy improvement of about 5% on an isolated-phoneme recognition task and indicate that the proposed method outperforms conventional PCA-based approaches. Moreover, the method retains its benefits in noisy environments and does not degrade under such conditions.
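The pipeline the abstract describes (embed training features with Laplacian eigenmaps, then map new points through a kernel-weighted out-of-sample extension) can be sketched as below. This is a minimal illustration, not the authors' implementation: the Gaussian kernel bandwidth `sigma`, the function names, and the random toy data standing in for MFCC frames are all assumptions made for the example.

```python
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmaps(X, dim=2, sigma=1.0):
    """Embed rows of X into `dim` dimensions via Laplacian eigenmaps.

    Builds a Gaussian affinity matrix, forms the graph Laplacian L = D - W,
    and solves the generalized eigenproblem L v = lambda D v, keeping the
    eigenvectors with the smallest nonzero eigenvalues.
    """
    # pairwise squared distances and Gaussian affinities (full graph for brevity)
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-D2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    D = np.diag(W.sum(axis=1))          # degree matrix
    L = D - W                           # unnormalized graph Laplacian
    vals, vecs = eigh(L, D)             # generalized symmetric eigenproblem
    Z = vecs[:, 1:dim + 1]              # skip the trivial constant eigenvector
    return Z, W

def out_of_sample(x, X, Z, sigma=1.0):
    """LELVM-style out-of-sample mapping for a new point x:
    a kernel-weighted average of the training embeddings Z."""
    k = np.exp(-((X - x) ** 2).sum(-1) / (2.0 * sigma ** 2))
    w = k / k.sum()                     # normalized kernel weights
    return w @ Z                        # convex combination of latent points

# toy demo: 30 random 5-D "feature vectors" standing in for MFCC frames
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
Z, W = laplacian_eigenmaps(X, dim=2)
z_new = out_of_sample(X[0], X, Z)
```

The out-of-sample step is what makes the mapping usable at recognition time: unseen frames are projected without re-solving the eigenproblem, at the cost of one kernel evaluation per training point.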