Constrained Viterbi decoding for embedded user-customised password speaker recognition

  • Authors:
  • Anthony Larcher; Jean-Francois Bonastre; John S. D. Mason

  • Affiliations:
  • University of Avignon, LIA, Avignon, France; University of Avignon, LIA, Avignon, France; Swansea University, Singleton Park, Swansea, UK

  • Venue:
  • Proceedings of the 2010 ACM Symposium on Applied Computing
  • Year:
  • 2010

Abstract

Embedded speaker recognition on mobile devices is subject to ergonomic constraints and limited computing resources. GMM/UBM (Gaussian mixture model / universal background model) systems have proved effective in more classical contexts, where good accuracy depends on a relatively large quantity of speech data. The proposed GMM/UBM extension addresses limited-resource situations and takes advantage of the temporal structure of speech by using client-customised utterances structured by a Markov model. This temporal information is then exploited through Viterbi decoding to enhance discrimination, increasing the gap between client and impostor scores. Experiments on the MyIdea database are performed both when impostors know the client utterance and when they do not, highlighting the potential of this new approach. A relative gain of up to 64% in equal error rate (EER) is achieved when impostors do not know the client utterances, and performance is equivalent to the GMM/UBM baseline system in the other configurations.
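
To illustrate the kind of temporal constraint that Viterbi decoding can impose, the following is a minimal sketch and not the authors' implementation. It assumes a left-to-right state topology (each frame either stays in the current state or advances to the next), uniform transition costs, and a precomputed table of per-frame, per-state log-likelihoods. The function name `constrained_viterbi` and the input layout are hypothetical, and the abstract does not specify any of these details.

```python
from typing import List, Tuple

NEG_INF = float("-inf")


def constrained_viterbi(frame_loglik: List[List[float]]) -> Tuple[float, List[int]]:
    """Best left-to-right state path through a table of log-likelihoods.

    frame_loglik[t][s] is the (assumed, precomputed) log-likelihood of
    frame t under state s. The path must start in state 0, end in the
    last state, and may only stay in a state or advance by one.
    Transition costs are assumed uniform and therefore omitted.
    """
    n_frames = len(frame_loglik)
    n_states = len(frame_loglik[0])
    if n_frames < n_states:
        raise ValueError("need at least one frame per state")

    # delta[s]: best cumulative log-likelihood of a path ending in state s
    delta = [NEG_INF] * n_states
    delta[0] = frame_loglik[0][0]
    backptr: List[List[int]] = []

    for t in range(1, n_frames):
        new_delta = [NEG_INF] * n_states
        ptr = list(range(n_states))
        for s in range(n_states):
            stay = delta[s]
            advance = delta[s - 1] if s > 0 else NEG_INF
            if advance > stay:
                best_prev, best = s - 1, advance
            else:
                best_prev, best = s, stay
            if best > NEG_INF:
                new_delta[s] = best + frame_loglik[t][s]
            ptr[s] = best_prev
        delta = new_delta
        backptr.append(ptr)

    # Backtrack from the mandatory final state.
    path = [n_states - 1]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[-1], path


# Toy input: two states, four frames; the first two frames favour state 0.
toy = [[-1.0, -5.0], [-1.0, -5.0], [-5.0, -1.0], [-5.0, -1.0]]
score, path = constrained_viterbi(toy)
```

In a verification setting, a score of this kind would typically be compared against a background-model score for the same frames; how the paper combines the two is not stated in the abstract.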