An Automatic Speaker Recognition System

Authors:
P. Chakraborty;F. Ahmed;Md. Monirul Kabir;Md. Shahjahan;Kazuyuki Murase
Affiliations:
Department of Electrical & Electronic Engineering, Khulna University of Engineering and Technology, Khulna, Bangladesh 920300;Department of Electrical & Electronic Engineering, Khulna University of Engineering and Technology, Khulna, Bangladesh 920300;Dept. of Human and Artificial Intelligence Systems, Graduate School of Engineering,;Department of Electrical & Electronic Engineering, Khulna University of Engineering and Technology, Khulna, Bangladesh 920300;Dept. of Human and Artificial Intelligence Systems, Graduate School of Engineering, and Research and Education Program for Life Science, University of Fukui, Japan 910-8507
Venue:
Neural Information Processing
Year:
2007

Abstract

Speaker recognition is the process of identifying a speaker by analyzing the spectral shape of the voice signal. This is done by extracting features from the voice signal and matching them. Mel-frequency cepstral coefficients (MFCCs) are used for feature extraction, and the extracted coefficients are taken as the input to a vector quantization (VQ) process. VQ is a typical feature matching technique in which, during a training session, a VQ codebook is generated for each speaker by clustering that speaker's pre-defined spectral vectors (the training vectors). Finally, test data are matched against the trained data by searching for the nearest neighbor. The goal is to recognize the speakers correctly, using music and speech data (in both English and Bengali) for the recognition process. The correct recognition rate is almost ninety percent, which is comparatively better than the Hidden Markov Model (HMM) and the Artificial Neural Network (ANN).
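The VQ training and matching steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes plain k-means as the clustering step (the abstract does not name the algorithm), squared Euclidean distance as the distortion measure, and synthetic 3-dimensional vectors standing in for real MFCC features. All function names (`train_codebook`, `distortion`, `identify`, `fake_features`) are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_codebook(vectors, size, iters=20, seed=0):
    """Cluster one speaker's training vectors into `size` codewords.

    Assumption: plain k-means; the paper may use a different clustering rule.
    """
    rng = random.Random(seed)
    codebook = rng.sample(vectors, size)  # initialise from the training data
    for _ in range(iters):
        cells = [[] for _ in range(size)]
        for v in vectors:
            nearest = min(range(size), key=lambda k: sq_dist(v, codebook[k]))
            cells[nearest].append(v)
        for k, cell in enumerate(cells):
            if cell:  # keep the old codeword if its cell is empty
                codebook[k] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook

def distortion(vectors, codebook):
    """Average squared distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)

def identify(test_vectors, codebooks):
    """Return the speaker whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda name: distortion(test_vectors, codebooks[name]))

def fake_features(center, n, rng):
    """Synthetic stand-in for MFCC vectors: Gaussian noise around a centre."""
    return [[c + rng.gauss(0, 0.3) for c in center] for _ in range(n)]

rng = random.Random(1)
train = {
    "speaker_a": fake_features([0.0, 0.0, 0.0], 60, rng),
    "speaker_b": fake_features([3.0, 3.0, 3.0], 60, rng),
}
codebooks = {name: train_codebook(vecs, size=4) for name, vecs in train.items()}

# A test utterance drawn from the same distribution as speaker_b's training data.
test_utterance = fake_features([3.0, 3.0, 3.0], 20, rng)
result = identify(test_utterance, codebooks)
print(result)
```

In a real system the synthetic vectors would be replaced by MFCC frames extracted from each speaker's recordings, and the codebook size would be tuned on held-out data.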