Significance of Vowel-Like Regions for Speaker Verification Under Degraded Conditions
IEEE Transactions on Audio, Speech, and Language Processing
This study analyzes the effect of degradation on human and automatic speaker verification (SV). The perceptual test is conducted with subjects who have prior knowledge of speaker verification. An automatic SV system is developed using Mel-frequency cepstral coefficients (MFCC) and a Gaussian mixture model (GMM). Human and automatic SV performance is compared for clean training and various degraded test conditions. Speech signals are reconstructed in clean and degraded conditions by emphasizing different speaker-specific information and compared through the perceptual test. The perceptual cues that human subjects use as speaker-specific information are investigated, and their importance under degraded conditions is highlighted. The difference in the nature of the human and automatic SV tasks is examined in terms of falsely accepted and falsely rejected speech pairs. Finally, human versus automatic speaker verification is discussed, and possibilities for improving automatic SV performance under degraded conditions are suggested.
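The MFCC-GMM pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature matrices stand in for MFCC frames (here drawn from synthetic Gaussians rather than extracted from speech), and the verification score is the average per-frame log-likelihood under the claimed speaker's GMM.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical stand-ins for 13-dimensional MFCC frame sequences:
# enrollment data for the claimed speaker, plus two test utterances.
enroll_frames = rng.normal(0.0, 1.0, size=(500, 13))   # claimed speaker
test_same     = rng.normal(0.0, 1.0, size=(200, 13))   # same speaker
test_imposter = rng.normal(3.0, 1.0, size=(200, 13))   # different speaker

# Train a GMM speaker model on the enrollment features.
gmm = GaussianMixture(n_components=4, covariance_type="diag", random_state=0)
gmm.fit(enroll_frames)

# Verification score: average frame log-likelihood under the claimed model.
# A threshold on this score yields the accept/reject decision.
score_same = gmm.score(test_same)
score_imposter = gmm.score(test_imposter)

print(score_same > score_imposter)  # the genuine trial scores higher
```

In practice the features would come from an MFCC front end applied to real speech, and scores are usually normalized against a universal background model; this sketch only shows the modeling and scoring step.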