Unsupervised speaker recognition based on competition between self-organizing maps

Authors:
I. Lapidot;H. Guterman;A. Cohen
Affiliations:
Dept. of Software Eng., Negev Acad. Coll. of Eng., Beer-Sheva;-;-
Venue:
IEEE Transactions on Neural Networks
Year:
2002


Abstract

We present a method for clustering the speakers in an unlabeled and unsegmented conversation with a known number of speakers, when no a priori knowledge about the identity of the participants is given. Each speaker is modeled by a self-organizing map (SOM). The SOMs are randomly initialized. An iterative algorithm allows the data to move from one model to another and adjusts the SOMs. The restriction that the data can move only in small groups, rather than one feature vector at a time, forces the SOMs to adapt to speakers instead of phonemes or other vocal events. The method was applied to high-quality conversations with two to five participants and to two-speaker telephone-quality conversations. For two speakers (both high- and telephone-quality) and for three speakers, correct segmentation exceeded 80%. The problem becomes even harder when the number of participants is also unknown. Based on the iterative clustering algorithm, a validity criterion was also developed to estimate the number of speakers. In 16 out of 17 high-quality conversations between two and three participants, the estimated number of participants was correct. For telephone-quality conversations, the results were poorer.
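
The iterative competition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the SOM topology, the feature type, the group size, or the matching criterion. The sketch assumes a 1-D SOM per speaker, fixed-length groups of consecutive feature vectors, and mean squared quantization error as the criterion for assigning a group to a model. All function names and parameters (`train_som`, `segment_distortion`, `cluster_speakers`, `seg_len`, `n_units`) are hypothetical.

```python
import numpy as np

def train_som(data, codebook, epochs=5, lr0=0.5, sigma0=1.0):
    """Refine a 1-D SOM (a chain of units) on `data`; returns a new codebook."""
    codebook = np.asarray(codebook, dtype=float).copy()
    unit_idx = np.arange(len(codebook))
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.3)    # shrinking neighborhood
        for x in data:
            bmu = np.argmin(np.sum((codebook - x) ** 2, axis=1))
            h = np.exp(-((unit_idx - bmu) ** 2) / (2.0 * sigma ** 2))
            codebook += lr * h[:, None] * (x - codebook)
    return codebook

def segment_distortion(segment, codebook):
    """Mean squared distance from each vector to its nearest SOM unit."""
    d = ((segment[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).mean()

def cluster_speakers(features, n_speakers, seg_len=50, n_units=4,
                     n_iter=10, seed=0):
    """Assign fixed-length groups of feature vectors to competing SOMs."""
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    n_segs = len(features) // seg_len
    dim = features.shape[1]
    # Data may only move between models in groups of seg_len vectors.
    segments = features[: n_segs * seg_len].reshape(n_segs, seg_len, dim)
    # Random initialization of both the group labels and the SOMs.
    labels = rng.integers(n_speakers, size=n_segs)
    codebooks = [
        features[rng.choice(len(features), n_units, replace=False)]
        for _ in range(n_speakers)
    ]
    for _ in range(n_iter):
        # Retrain each SOM on the groups currently assigned to it.
        for k in range(n_speakers):
            members = segments[labels == k]
            if len(members):
                codebooks[k] = train_som(members.reshape(-1, dim), codebooks[k])
        # Move each group, as a whole, to the model that fits it best.
        new_labels = np.array([
            np.argmin([segment_distortion(s, cb) for cb in codebooks])
            for s in segments
        ])
        if np.array_equal(new_labels, labels):
            break  # no group changed model
        labels = new_labels
    return labels
```

Calling `cluster_speakers(features, n_speakers=2)` on an array of shape `(n_frames, n_dims)` returns one model index per group. Reassigning whole groups rather than single vectors reflects the restriction stated in the abstract; the specific group length used here is an arbitrary choice for illustration.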