Optimizing features extraction parameters for speaker verification

Authors:
Donato Impedovo;Mario Refice
Affiliations:
Dipartimento di Elettrotecnica ed Elettronica, Politecnico di Bari, Bari, Italy;Dipartimento di Elettrotecnica ed Elettronica, Politecnico di Bari, Bari, Italy
Venue:
ICS'08 Proceedings of the 12th WSEAS international conference on Systems
Year:
2008

Abstract

This paper investigates the role of frame length in the computation of Mel Frequency Cepstral Coefficients (MFCCs) in a text-dependent speaker verification system. Variations in the subjects' vocal characteristics over time, and the related information conveyed in the MFCCs, cause a significant degradation in verification performance. In our experiments we tested different frame lengths for feature extraction in the training and verification phases, for a set of speakers whose speech productions spanned approximately 3 months. Results show that a suitable combination of frame lengths for the training and testing phases can improve performance. The approach reduces the ER by up to 40% for female speakers and by up to 58% for the male subset.
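
As an illustration of what varying the frame length means in practice, the following minimal sketch splits a signal into overlapping analysis frames, the step that precedes MFCC computation. It is not the authors' implementation: the function name `frame_signal`, the 16 kHz sample rate, the 25 ms and 20 ms frame lengths, and the 10 ms hop are all assumptions chosen only for the example.

```python
def frame_signal(samples, sample_rate_hz, frame_ms, hop_ms):
    """Split a sequence of samples into overlapping frames.

    A trailing partial frame is dropped. All parameter values used
    below are hypothetical and are not taken from the paper.
    """
    frame_len = int(round(sample_rate_hz * frame_ms / 1000))
    hop_len = int(round(sample_rate_hz * hop_ms / 1000))
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame and hop lengths must be positive")

    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop_len
    return frames


# One second of silence at an assumed 16 kHz sample rate.
signal = [0.0] * 16000

# Assumed setup: one frame length for training, another for verification.
train_frames = frame_signal(signal, 16000, frame_ms=25, hop_ms=10)
test_frames = frame_signal(signal, 16000, frame_ms=20, hop_ms=10)
```

With the hop held fixed, changing the frame length changes how many samples feed each spectral estimate, and therefore the MFCCs derived from it, while the number of frames per second stays almost the same.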