Time and Frequency Pruning for Speaker Identification

Authors:
Affiliations:
Venue:
ICPR '98 Proceedings of the 14th International Conference on Pattern Recognition, Volume 2
Year:
1998

Abstract

This work is an attempt to refine decisions in speaker identification. A test utterance is divided into multiple time-frequency blocks, and a normalized likelihood score is calculated on each block. Instead of averaging the block likelihoods over the whole test utterance, some of them are rejected (pruning), and the final score is computed from a limited number of time-frequency blocks. The results obtained in the special case of time pruning led the authors to experiment with a joint time and frequency pruning approach. The optimal percentage of pruned blocks is learned on a tuning data set using the minimum identification error criterion. Validation of the time-frequency pruning process on 567 speakers leads to a significant error rate reduction (up to 41% on TIMIT) for short training and test durations.
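
The pruning and tuning steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not say which blocks are rejected, so the sketch assumes the lowest-scoring blocks are dropped. The function names (pruned_score, identify, tune_prune_fraction) and the flat list of per-block scores are hypothetical, and the normalized block likelihoods are assumed to be computed elsewhere.

```python
from typing import Dict, List, Sequence, Tuple


def pruned_score(block_scores: Sequence[float], prune_fraction: float) -> float:
    """Average the block scores left after dropping the lowest ones.

    Assumption: pruning rejects the lowest-scoring blocks; the abstract
    does not state the rejection rule.
    """
    if not block_scores:
        raise ValueError("need at least one block score")
    if not 0.0 <= prune_fraction < 1.0:
        raise ValueError("prune_fraction must be in [0, 1)")
    n_drop = int(prune_fraction * len(block_scores))
    n_keep = max(1, len(block_scores) - n_drop)
    kept = sorted(block_scores, reverse=True)[:n_keep]
    return sum(kept) / len(kept)


def identify(scores_by_speaker: Dict[str, Sequence[float]],
             prune_fraction: float) -> str:
    """Return the speaker whose pruned average score is highest."""
    return max(
        scores_by_speaker,
        key=lambda spk: pruned_score(scores_by_speaker[spk], prune_fraction),
    )


def tune_prune_fraction(
    tuning_set: List[Tuple[str, Dict[str, Sequence[float]]]],
    candidates: Sequence[float],
) -> float:
    """Pick the candidate fraction with the fewest identification errors.

    Each tuning item is (true_speaker, scores_by_speaker). Ties go to the
    first candidate listed.
    """
    def errors(fraction: float) -> int:
        return sum(identify(scores, fraction) != true_speaker
                   for true_speaker, scores in tuning_set)

    return min(candidates, key=errors)


# Toy data: one badly scored block hides the true speaker until it is pruned.
toy_scores = {
    "alice": [2.0, 2.0, 2.0, -10.0],
    "bob": [0.5, 0.5, 0.5, 0.5],
}
best_fraction = tune_prune_fraction([("alice", toy_scores)], [0.0, 0.25, 0.5])
```

In this toy case, plain averaging favors "bob" because a single outlier block drags down the score for "alice", while pruning a quarter of the blocks recovers the correct decision. Time-only pruning corresponds to the same procedure with blocks taken along the time axis only.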