A clustering based feature selection method in spectro-temporal domain for speech recognition

  • Authors:
  • Nafiseh Esfandian;Farbod Razzazi;Alireza Behrad

  • Affiliations:
  • Department of Electrical and Computer Engineering, Islamic Azad University, Science and Research Branch, Tehran, Iran;Department of Electrical and Computer Engineering, Islamic Azad University, Science and Research Branch, Tehran, Iran;Faculty of Engineering, Shahed University, Tehran, Iran

  • Venue:
  • Engineering Applications of Artificial Intelligence
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Spectro-temporal representation of speech has become one of the leading signal representation approaches in speech recognition systems in recent years. This representation suffers from high dimensionality of the features space which makes this domain unsuitable for practical speech recognition systems. In this paper, a new clustering based method is proposed for secondary feature selection/extraction in the spectro-temporal domain. In the proposed representation, Gaussian mixture models (GMM) and weighted K-means (WKM) clustering techniques are applied to spectro-temporal domain to reduce the dimensions of the features space. The elements of centroid vectors and covariance matrices of clusters are considered as attributes of the secondary feature vector of each frame. To evaluate the efficiency of the proposed approach, the tests were conducted for new feature vectors on classification of phonemes in main categories of phonemes in TIMIT database. It was shown that by employing the proposed secondary feature vector, a significant improvement was revealed in classification rate of different sets of phonemes comparing with MFCC features. The average achieved improvements in classification rates of voiced plosives comparing to MFCC features is 5.9% using WKM clustering and 6.4% using GMM clustering. The greatest improvement is about 7.4% which is obtained by using WKM clustering in classification of front vowels comparing to MFCC features.