Comparison of clustering methods: A case study of text-independent speaker modeling

Authors:
Tomi Kinnunen; Ilja Sidoroff; Marko Tuononen; Pasi Fränti
Affiliations:
Speech and Image Processing Unit, School of Computing, University of Eastern Finland, P.O. Box 111, FI-80101 Joensuu, Finland (all authors)
Venue:
Pattern Recognition Letters
Year:
2011

Abstract

Clustering is needed in various applications such as biometric person authentication, speech coding and recognition, image compression and information retrieval. Hundreds of clustering methods have been proposed in various fields but, surprisingly, few extensive studies actually compare them. An important question is how much the choice of clustering method matters for the final pattern recognition application. Our goal is to provide a thorough experimental comparison of clustering methods for text-independent speaker verification. We consider the parametric Gaussian mixture model (GMM) and the non-parametric vector quantization (VQ) model, trained with the best-known clustering algorithms, including iterative (K-means, random swap, expectation-maximization), hierarchical (pairwise nearest neighbor, split, split-and-merge), evolutionary (genetic algorithm), neural (self-organizing map) and fuzzy (fuzzy C-means) approaches. We study recognition accuracy, processing time, clustering validity, and the correlation between clustering quality and recognition accuracy. These complementary observations indicate that clustering is not a critical task in speaker recognition, and that the choice of algorithm should be based on computational complexity and simplicity of implementation. There are three main reasons for this: the data is not clustered, large models are used, and only the best algorithms are considered. For low-order models, however, the choice of algorithm can have a significant effect.
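To make the VQ side of the comparison concrete, the following is a minimal sketch of VQ speaker modeling with K-means, not the authors' implementation. It assumes each speaker is represented by a codebook trained on that speaker's feature vectors, and that a test utterance is scored by its average quantization distortion against each codebook (lower means a better match). The function names (`kmeans`, `distortion`, `make_speaker`) and the synthetic two-dimensional Gaussian data, which stands in for real acoustic features, are assumptions made for illustration only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, v):
    """Index of the code vector closest to v."""
    return min(range(len(codebook)), key=lambda j: sq_dist(codebook[j], v))


def kmeans(vectors, k, iters=20, seed=0):
    """Plain K-means: random initial codebook, then alternating
    nearest-neighbor partitioning and centroid updates."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        cells = [[] for _ in range(k)]
        for v in vectors:
            cells[nearest(codebook, v)].append(v)
        for j, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous code vector
                codebook[j] = [sum(c) / len(cell) for c in zip(*cell)]
    return codebook


def distortion(codebook, vectors):
    """Average squared distance from each vector to its nearest code vector."""
    return sum(min(sq_dist(c, v) for c in codebook) for v in vectors) / len(vectors)


def make_speaker(center, n, rng):
    """Hypothetical stand-in for a speaker's feature vectors (synthetic data)."""
    return [[rng.gauss(m, 1.0) for m in center] for _ in range(n)]


rng = random.Random(1)
train_a = make_speaker([0.0, 0.0], 200, rng)
train_b = make_speaker([5.0, 5.0], 200, rng)
test_a = make_speaker([0.0, 0.0], 50, rng)  # unseen data from "speaker A"

model_a = kmeans(train_a, k=4)
model_b = kmeans(train_b, k=4)

d_a = distortion(model_a, test_a)
d_b = distortion(model_b, test_a)
print("distortion vs A:", d_a, "vs B:", d_b)
```

In this toy setup the test data should match speaker A's codebook far better than speaker B's. Swapping `kmeans` for another codebook trainer while keeping `distortion` fixed is one way to set up the kind of comparison the abstract describes.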