Spectro-temporal representation of speech has become one of the leading signal representation approaches in speech recognition systems in recent years. However, the high dimensionality of its feature space makes it unsuitable for practical speech recognition systems. In this paper, a new clustering-based method is proposed for secondary feature selection/extraction in the spectro-temporal domain. In the proposed representation, Gaussian mixture model (GMM) and weighted K-means (WKM) clustering techniques are applied to the spectro-temporal domain to reduce the dimensionality of the feature space. The elements of the centroid vectors and covariance matrices of the clusters are taken as the attributes of the secondary feature vector of each frame. To evaluate the proposed approach, the new feature vectors were tested on phoneme classification within the main phoneme categories of the TIMIT database. The proposed secondary feature vector yielded a significant improvement in classification rate over MFCC features for different sets of phonemes. For voiced plosives, the average improvement in classification rate over MFCC features is 5.9% with WKM clustering and 6.4% with GMM clustering. The greatest improvement, about 7.4% over MFCC features, is obtained with WKM clustering in the classification of front vowels.
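As a rough illustration of the clustering step described above, the following sketch builds a per-frame secondary feature vector with weighted K-means. It is not the authors' implementation. Several things here are assumptions made only for the example: each frame is represented as a set of points in the spectro-temporal space with a non-negative weight per point (for instance, the response magnitude), the number of clusters `k` is chosen by hand, the covariance of each cluster is reduced to its diagonal (per-axis variances), and the function names `weighted_kmeans` and `secondary_feature_vector` are hypothetical.

```python
import numpy as np


def weighted_kmeans(points, weights, k, n_iter=50, seed=0):
    """Weighted K-means: centroids are weight-averaged means of their members."""
    rng = np.random.default_rng(seed)
    # Assumption: initialise centroids by sampling k distinct points,
    # with probability proportional to their weight.
    idx = rng.choice(len(points), size=k, replace=False, p=weights / weights.sum())
    centroids = points[idx].astype(float)

    def assign(c):
        # Squared Euclidean distance from every point to every centroid.
        d2 = ((points[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        for j in range(k):
            mask = labels == j
            # Leave a centroid unchanged if its cluster is empty.
            if weights[mask].sum() > 0:
                centroids[j] = np.average(points[mask], axis=0, weights=weights[mask])
    return centroids, assign(centroids)


def secondary_feature_vector(points, weights, k):
    """Concatenate centroid coordinates and per-axis variances of each cluster."""
    centroids, labels = weighted_kmeans(points, weights, k)
    # Sort clusters by centroid coordinates so the feature layout is stable.
    order = np.lexsort(centroids.T[::-1])
    parts = []
    for j in order:
        mask = labels == j
        w = weights[mask]
        if w.sum() > 0:
            # Assumption: diagonal covariance only (weighted per-axis variance).
            var = np.average((points[mask] - centroids[j]) ** 2, axis=0, weights=w)
        else:
            var = np.zeros(points.shape[1])
        parts.append(np.concatenate([centroids[j], var]))
    return np.concatenate(parts)
```

For `d`-dimensional points and `k` clusters, this yields a vector of length `2 * k * d` per frame, which is how the clustering step reduces the dimensionality relative to the full spectro-temporal representation. A GMM variant would replace the hard assignments with posterior responsibilities and could keep full covariance matrices.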