Robust speaker verification with state duration modeling

  • Authors:
  • Nestor Becerra Yoma; Tarciano Facco Pegoraro

  • Affiliations:
  • Electrical Engineering Department, University of Chile, Av. Tupper 2007, P.O. Box 412-3, Santiago, Chile; Ericsson do Brasil, Rodovia Ermênio de Oliveira Penteado km 55,5, Indaiatuba, SP, Brazil

  • Venue:
  • Speech Communication
  • Year:
  • 2002

Abstract

This paper addresses state duration modeling in the Viterbi algorithm for a text-dependent speaker verification task. The results suggest that temporal constraints can lead to reductions of 10% and 20% in the error rates for signals corrupted by noise at SNRs of 6 dB and 0 dB, respectively, and that accurate statistical modeling of state duration (e.g. with a gamma probability distribution) does not seem to be very relevant once maximal and minimal state duration restrictions are imposed. In contrast, temporal restrictions do not seem to give any improvement in speaker verification with clean speech or at high SNR. It is also shown that state duration constraints can easily be applied with likelihood normalization metrics based on speaker-dependent temporal parameters. Finally, the results presented here show that word-position-dependent state duration parameters give no significant improvement over the word-position-independent approach when the coarticulation effect between contiguous words is low.
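To make the idea of minimal and maximal state duration restrictions concrete, the following is a minimal sketch of a duration-constrained Viterbi alignment. It is not the authors' implementation: it assumes a strict left-to-right HMM without skips, ignores transition probabilities, requires every minimal duration to be at least one frame, and applies only hard duration bounds (no gamma or other duration density). The names `constrained_viterbi`, `log_b`, `d_min` and `d_max` are hypothetical.

```python
NEG_INF = float("-inf")


def constrained_viterbi(log_b, d_min, d_max):
    """Best left-to-right alignment under per-state duration bounds.

    log_b[t][i]: log-likelihood of frame t in state i (assumed input).
    d_min[i], d_max[i]: minimal and maximal frames spent in state i
    (assumed d_min[i] >= 1).
    Returns (score, durations); score is -inf if no alignment is feasible.
    """
    T, N = len(log_b), len(d_min)
    # cum[i][t] = sum of log_b[0..t-1][i], so a segment sum costs O(1).
    cum = [[0.0] * (T + 1) for _ in range(N)]
    for i in range(N):
        for t in range(T):
            cum[i][t + 1] = cum[i][t] + log_b[t][i]

    # best[i][t]: best score with states 0..i covering frames 0..t-1.
    best = [[NEG_INF] * (T + 1) for _ in range(N)]
    back = [[0] * (T + 1) for _ in range(N)]
    for i in range(N):
        for t in range(1, T + 1):
            # Only durations inside [d_min, d_max] are ever considered.
            for d in range(d_min[i], min(d_max[i], t) + 1):
                start = t - d
                if i == 0:
                    # The first state must begin at frame 0.
                    prev = 0.0 if start == 0 else NEG_INF
                else:
                    prev = best[i - 1][start]
                if prev == NEG_INF:
                    continue
                score = prev + cum[i][t] - cum[i][start]
                if score > best[i][t]:
                    best[i][t] = score
                    back[i][t] = d

    if best[N - 1][T] == NEG_INF:
        return NEG_INF, []
    # Trace back the duration chosen for each state.
    durations, t = [], T
    for i in range(N - 1, -1, -1):
        durations.append(back[i][t])
        t -= back[i][t]
    return best[N - 1][T], durations[::-1]


# Toy data: frame 0 fits state 0, frames 1..5 fit state 1.
log_b = [[-1.0, -5.0]] + [[-5.0, -1.0]] * 5
print(constrained_viterbi(log_b, [1, 1], [6, 6]))  # (-6.0, [1, 5])
print(constrained_viterbi(log_b, [2, 2], [4, 4]))  # (-10.0, [2, 4])
```

In the toy example, the loosely bounded search leaves state 0 after a single frame, while the bounds of 2 to 4 frames force a longer stay in state 0 at a lower score. With noisy frames, bounds of this kind rule out alignments whose state durations are implausible for the claimed speaker, which is the mechanism the abstract refers to as temporal constraints.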