A scale-rate filter selection method in the spectro-temporal domain for phoneme classification

Authors:
Mehdi Fartash; Saeed Setayeshi; Farbod Razzazi
Affiliations:
Department of Electrical and Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran; Department of Medical Radiation, Amirkabir University of Technology, Tehran, Iran; Department of Electrical and Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
Venue:
Computers and Electrical Engineering
Year:
2013

Abstract

Recently, studies employing auditory models in speech recognition systems have increased significantly. In this paper, we propose a new evolutionarily tuned feature extraction method based on spectro-temporal analysis. In the proposed model, each phoneme has its own subspace, defined by a best scale for the spectral filter and a best rate for the temporal filter. These two parameters are obtained by a genetic cellular automata evolutionary algorithm. The features extracted from each phoneme-specific subspace are classified by a binary one-versus-rest support vector machine. Finally, a multiclass classifier for all phonemes is built by combining these sub-models. The proposed method significantly improves phoneme discrimination, especially for highly confusable phonemes. To demonstrate the effectiveness of the proposed feature sets, the method was empirically compared with two baseline models. The relative improvements in classification rate over the state-of-the-art baseline model are about 10% for voiced plosives, unvoiced plosives, and nasals, and about 7.38% for front vowels.
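
The structure described in the abstract, one sub-model per phoneme with its own (scale, rate) pair, combined into a multiclass decision, can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the paper: `PhonemeSubModel`, `toy_extract`, and `classify` are illustrative names, a linear decision function stands in for a trained binary SVM, and the toy extractor stands in for real spectro-temporal filtering. The abstract does not say how the sub-models are combined; picking the highest-scoring sub-model is an assumption. The (scale, rate) values are taken as already selected, so the genetic cellular automata search is not shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical extractor type: (signal, scale, rate) -> feature vector.
Extractor = Callable[[Sequence[float], float, float], List[float]]


@dataclass
class PhonemeSubModel:
    """One-versus-rest sub-model with its own filter parameters (illustrative)."""
    phoneme: str
    scale: float          # assumed: already-selected spectral-filter scale
    rate: float           # assumed: already-selected temporal-filter rate
    weights: List[float]  # linear decision function standing in for a trained SVM
    bias: float = 0.0

    def score(self, signal: Sequence[float], extract: Extractor) -> float:
        # Features are extracted in this phoneme's own (scale, rate) subspace.
        feats = extract(signal, self.scale, self.rate)
        return sum(w * f for w, f in zip(self.weights, feats)) + self.bias


def toy_extract(signal: Sequence[float], scale: float, rate: float) -> List[float]:
    """Toy stand-in for spectro-temporal filtering, NOT the paper's method."""
    mean = sum(signal) / len(signal)
    spread = max(signal) - min(signal)
    return [scale * mean, rate * spread]


def classify(signal: Sequence[float],
             models: List[PhonemeSubModel],
             extract: Extractor) -> str:
    """Assumed combination rule: return the phoneme whose sub-model scores highest."""
    return max(models, key=lambda m: m.score(signal, extract)).phoneme


# Two hypothetical sub-models with different (scale, rate) pairs.
models = [
    PhonemeSubModel("b", scale=1.0, rate=2.0, weights=[1.0, 0.0]),
    PhonemeSubModel("m", scale=2.0, rate=1.0, weights=[0.0, 1.0]),
]

flat_label = classify([1.0, 1.0, 1.0], models, toy_extract)
wide_label = classify([0.0, 4.0], models, toy_extract)
```

In this sketch each sub-model sees a different feature vector for the same input, because extraction depends on that sub-model's own scale and rate. Raw scores from separately trained binary classifiers are not necessarily comparable, so a real system might calibrate them before taking the maximum.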