Score fusion in text-dependent speaker recognition systems

Authors:
Jiří Mekyska;Marcos Faundez-Zanuy;Zdeněk Smékal;Joan Fàbregas
Affiliations:
Signal Processing Laboratory, Department of Telecommunications, Faculty of Electrical Engineering and Communication, Brno University of Technology, Brno, Czech Republic;Escola Universitària Politècnica de Mataró, Barcelona, Spain;Signal Processing Laboratory, Department of Telecommunications, Faculty of Electrical Engineering and Communication, Brno University of Technology, Brno, Czech Republic;Escola Universitària Politècnica de Mataró, Barcelona, Spain
Venue:
COST'10 Proceedings of the 2010 international conference on Analysis of Verbal and Nonverbal Communication and Enactment
Year:
2010

Abstract

Owing to several significant advantages, text-dependent speaker recognition is still widely used in biometric systems. Compared with text-independent systems, text-dependent systems are more accurate and more resistant to replay attacks. Many approaches to text-dependent recognition exist. This paper introduces a combination of classifiers based on fractional distances, the biometric dispersion matcher, and dynamic time warping. The first two classifiers are based on a voice imprint; they have low memory requirements and a fast recognition procedure, which is especially advantageous in low-cost, battery-powered biometric systems. It is shown that trained score fusion reaches a successful detection rate of 98.98%, and 92.19% in the case of microphone mismatch. In verification, the system reached an equal error rate of 2.55%, and 6.77% under microphone mismatch. The system was tested on a Catalan database of 48 speakers (three 3 s training samples per speaker).
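The abstract does not state the exact fusion rule, so the following is only a minimal sketch of the general idea, assuming a Minkowski-style fractional distance (exponent 0 < p < 1) and a weighted sum of min-max normalized distance scores. The function names, the value p = 0.5, and the weights are hypothetical; in a trained fusion the weights would be learned from training data rather than fixed by hand.

```python
def fractional_distance(x, y, p=0.5):
    """Minkowski-style distance with a fractional exponent 0 < p < 1 (assumed form)."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def min_max_normalize(scores):
    """Rescale one classifier's scores to [0, 1] so classifiers are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(score_lists, weights):
    """Weighted-sum fusion: one list of distances per classifier, one weight each."""
    if len(score_lists) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    normalized = [min_max_normalize(scores) for scores in score_lists]
    n_candidates = len(normalized[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, normalized))
        for i in range(n_candidates)
    ]


# Hypothetical distances from two classifiers to three enrolled speakers
# (lower means a closer match).
classifier_a = [1.0, 3.0, 2.0]
classifier_b = [10.0, 20.0, 30.0]
fused = fuse_scores([classifier_a, classifier_b], weights=[0.5, 0.5])
best_speaker = fused.index(min(fused))  # index of the closest enrolled speaker
```

Normalizing before fusion matters because the individual classifiers produce distances on different scales; without it, the classifier with the largest numeric range would dominate the sum.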