Probabilistic scoring using decision trees for fast and scalable speaker recognition

  • Authors:
  • Gilles Gonon; Frédéric Bimbot; Rémi Gribonval

  • Affiliations:
  • IRISA/METISS (CNRS & INRIA), Campus Universitaire de Beaulieu, 35042 Rennes Cedex, France (all three authors)

  • Venue:
  • Speech Communication
  • Year:
  • 2009

Abstract

In the context of fast and low-cost speaker recognition, this article investigates several techniques based on decision trees. A new approach is introduced in which the trees are used to estimate a score function rather than to return a decision among classes. This technique is developed to approximate the GMM log-likelihood ratio (LLR) score function. Building on this approach, three extensions are derived to improve the accuracy of the proposed trees. The first quantizes the LLR function in order to build classification trees on the LLR values. The second uses knowledge of the GMM distribution of the acoustic features to build oblique trees. The third uses a low-complexity score function in each of the tree leaves. A series of comparative experiments is performed on the NIST 2005 speaker recognition evaluation data to evaluate the impact of the proposed improvements in terms of efficiency, execution time and algorithmic complexity. Against a baseline system with an Equal Error Rate (EER) of 9.6% on the NIST 2005 evaluation, the best tree-based configuration achieves an EER of 12.9%, with a computational cost suited to embedded devices and an execution time suitable for real-time speaker identification.
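
The core idea of the abstract, a tree that returns an approximate LLR score instead of a class decision, can be illustrated with a minimal sketch. It is not the authors' implementation: the two toy 2-D diagonal GMMs, the tree depth, the axis-aligned (not oblique) splits, the constant leaf values, and every function name below are assumptions made for illustration only.

```python
import math
import random

def gmm_logpdf(x, gmm):
    """Log-density of a diagonal-covariance GMM at one frame x."""
    weights, means, variances = gmm
    logs = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, m, v in zip(x, mu, var):
            ll -= 0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        logs.append(ll)
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))

def llr(x, speaker, background):
    """Exact per-frame log-likelihood ratio that the tree should approximate."""
    return gmm_logpdf(x, speaker) - gmm_logpdf(x, background)

def sample(gmm, n, rng):
    """Draw n frames from a diagonal-covariance GMM."""
    weights, means, variances = gmm
    frames = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        frames.append([rng.gauss(m, math.sqrt(v))
                       for m, v in zip(means[k], variances[k])])
    return frames

def build_tree(frames, targets, depth, min_leaf=20):
    """Regression tree with axis-aligned splits and constant leaf scores."""
    n, total = len(frames), sum(targets)
    if depth == 0 or n < 2 * min_leaf:
        return total / n
    best = None  # (gain, dimension, threshold)
    for dim in range(len(frames[0])):
        order = sorted(range(n), key=lambda i: frames[i][dim])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += targets[order[k - 1]]
            lo, hi = frames[order[k - 1]][dim], frames[order[k]][dim]
            if k < min_leaf or n - k < min_leaf or lo == hi:
                continue
            # Maximizing this quantity minimizes the summed squared error.
            gain = left_sum ** 2 / k + (total - left_sum) ** 2 / (n - k)
            if best is None or gain > best[0]:
                best = (gain, dim, lo)
    if best is None:
        return total / n
    _, dim, thr = best
    left = [i for i in range(n) if frames[i][dim] <= thr]
    right = [i for i in range(n) if frames[i][dim] > thr]
    return (dim, thr,
            build_tree([frames[i] for i in left],
                       [targets[i] for i in left], depth - 1, min_leaf),
            build_tree([frames[i] for i in right],
                       [targets[i] for i in right], depth - 1, min_leaf))

def tree_score(tree, x):
    """Walk from the root to a leaf; the leaf value is the approximate LLR."""
    while isinstance(tree, tuple):
        dim, thr, left, right = tree
        tree = left if x[dim] <= thr else right
    return tree

def utterance_score(tree, frames):
    """Average the per-frame tree scores over an utterance."""
    return sum(tree_score(tree, x) for x in frames) / len(frames)

# Toy models (hypothetical parameters): (weights, means, variances).
background = ([0.5, 0.5], [[-1.0, 0.0], [1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]])
speaker = ([0.5, 0.5], [[-1.0, 1.5], [2.0, 1.0]], [[0.5, 0.5], [0.5, 0.5]])

rng = random.Random(0)
train = sample(speaker, 2000, rng) + sample(background, 2000, rng)
targets = [llr(x, speaker, background) for x in train]
tree = build_tree(train, targets, depth=6)

mean_t = sum(targets) / len(targets)
var = sum((t - mean_t) ** 2 for t in targets) / len(targets)
mse = sum((tree_score(tree, x) - t) ** 2
          for x, t in zip(train, targets)) / len(targets)

target_utt = sample(speaker, 200, rng)
impostor_utt = sample(background, 200, rng)
print("LLR variance:", var, "tree MSE:", mse)
print("target:", utterance_score(tree, target_utt),
      "impostor:", utterance_score(tree, impostor_utt))
```

In this sketch, scoring a frame costs at most `depth` scalar comparisons, whereas the exact LLR evaluates every Gaussian component of both models. The extensions listed in the abstract (quantized LLR classes, oblique splits, non-constant leaf functions) are not implemented here.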