Robustness study of free-text speaker identification and verification

Authors:
Yu-Hung Kao;John S. Baras;P. K. Rajasekaran
Affiliations:
University of Maryland and Texas Instruments Incorporated, Dallas, TX;University of Maryland, College Park;Texas Instruments Incorporated, Dallas, TX
Venue:
ICASSP'93 Proceedings of the 1993 IEEE international conference on Acoustics, speech, and signal processing: speech processing - Volume II
Year:
1993

Citing 4
Cited 0

Acoustical and environmental robustness in automatic speech recognition

Text-independent speaker verification by discriminator counting
ICASSP '91 Proceedings of the Acoustics, Speech, and Signal Processing, 1991. ICASSP-91., 1991 International Conference

RASTA-PLP speech analysis technique
ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1

SWITCHBOARD: telephone speech corpus for research and development
ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1

Abstract

Usable free-text speaker identification and voice verification systems must exhibit robustness under varying operational conditions. We study the degree of robustness provided by various signal processing techniques [1] [2] [3] by experimenting on a widely used long-distance telephone data base [4] [5] [6]. This data base consists of data recorded at two different sites, with data from one site much poorer in quality than the other; further, the recording equipment had been inadvertently changed for the latter half of the sessions, resulting in a significantly changed environment. Our study identifies the combination of techniques that provides consistent and significant improvements; our results surpass other published results [4] [5] [6] on the same task. Specifically, in the task of identifying 16 speakers, with training data from the recording prior to the equipment change and testing on data from a set after the change (the most challenging condition), we obtain a correct identification rate of 87.5% with an average rank of 1.12; [4] obtains the hitherto best result of 75% correct identification with an average rank of 1.56; without any robustness processing, the rate was only 12%. Detailed results of exhaustive experimentation are presented along with appropriate discussion.
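
The two figures of merit quoted above, correct identification rate and average rank, can be illustrated with a short sketch. It assumes one identification trial per speaker (16 trials in total) and that "rank" means the position of the true speaker in the system's sorted candidate list; neither assumption is stated in the abstract. The function name `identification_metrics` and the rank list are hypothetical, chosen only because 14 trials at rank 1 and 2 trials at rank 2 happen to be consistent with the reported 87.5% and 1.12.

```python
from typing import Sequence, Tuple


def identification_metrics(true_speaker_ranks: Sequence[int]) -> Tuple[float, float]:
    """Return (correct identification rate, average rank).

    Each entry is the rank of the true speaker in one trial,
    where rank 1 means the true speaker was the top candidate.
    """
    if not true_speaker_ranks:
        raise ValueError("at least one trial is required")
    n = len(true_speaker_ranks)
    # A trial counts as correct only when the true speaker is ranked first.
    correct = sum(1 for r in true_speaker_ranks if r == 1)
    return correct / n, sum(true_speaker_ranks) / n


# Hypothetical rank distribution (an assumption, not data from the paper):
# 16 trials, 14 with the true speaker at rank 1 and 2 at rank 2.
ranks = [1] * 14 + [2] * 2
rate, avg_rank = identification_metrics(ranks)
print(f"correct identification rate: {rate:.1%}, average rank: {avg_rank:.3f}")
```

Under the same one-trial-per-speaker assumption, 75% would correspond to 12 of 16 trials correct, and 12% would be close to 2 of 16.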