Histogram equalization using centroids of fuzzy c-means of background speakers' utterances for speaker identification

Authors:
Myung-Jae Kim;Il-Ho Yang;Ha-Jin Yu
Affiliations:
School of Computer Science, University of Seoul, Seoul, Korea;School of Computer Science, University of Seoul, Seoul, Korea;School of Computer Science, University of Seoul, Seoul, Korea
Venue:
SLSP'13: Proceedings of the First International Conference on Statistical Language and Speech Processing
Year:
2013

Abstract

In this paper, we propose a novel histogram equalization approach for speaker recognition with short utterances, which contain too few samples to build reliable histograms. The proposed method clusters the features of randomly selected background speakers' utterances and estimates the cumulative distribution function from two sources: the cluster centroids, sorted in ascending order, and the samples of the short test utterance. For each test sample, one rank is obtained within the test utterance and another within the sorted centroid set, and the sum of the two ranks is used to estimate the cumulative distribution function. For the evaluation, we use the ETRI PC database and simulate VoIP codecs for the test set. The system is compared with other feature normalization methods: CMN, MVN, and conventional HEQ. Our proposed method achieves relative error-rate reductions of 27.9%, 35.9%, and 30.1% in the G.729, SILK, and Speex test environments, respectively.
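
The rank-sum estimate described in the abstract can be illustrated with a minimal sketch for a single feature dimension. This is not the authors' implementation. The function name `rank_sum_heq`, the 0.5 offset in the rank-to-CDF conversion, and the standard normal reference distribution are all assumptions made for illustration, and the centroids are taken as already computed (for example by fuzzy c-means on background speakers' features) and sorted.

```python
from bisect import bisect_right
from statistics import NormalDist


def rank_sum_heq(test_values, sorted_centroids):
    """Equalize one feature dimension of a short test utterance.

    test_values: feature values of one dimension from the test utterance.
    sorted_centroids: background cluster centroids for the same dimension,
        sorted in ascending order (assumed to be precomputed).
    """
    sorted_test = sorted(test_values)
    total = len(sorted_test) + len(sorted_centroids)
    reference = NormalDist()  # assumption: standard normal target distribution
    equalized = []
    for x in test_values:
        # Rank of x among the samples of the test utterance (at least 1).
        rank_test = bisect_right(sorted_test, x)
        # Rank of x among the sorted background centroids (may be 0).
        rank_bg = bisect_right(sorted_centroids, x)
        # Sum of the two ranks; the 0.5 offset (an assumption) keeps the
        # estimated CDF strictly inside (0, 1) so inv_cdf is defined.
        cdf = (rank_test + rank_bg - 0.5) / total
        equalized.append(reference.inv_cdf(cdf))
    return equalized


if __name__ == "__main__":
    centroids = [-2.0, -1.0, 0.0, 1.0, 2.5]
    utterance = [0.3, -1.2, 2.0]
    print(rank_sum_heq(utterance, centroids))
```

Because the centroids contribute to every rank, even a three-sample utterance is mapped against eight reference points in this sketch, which is the intuition behind using background data when the utterance alone is too short for a histogram.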