Improving the characterization of the alternative hypothesis via minimum verification error training with applications to speaker verification

  • Authors:
  • Yi-Hsiang Chao;Wei-Ho Tsai;Hsin-Min Wang;Ruei-Chuan Chang

  • Affiliations:
  • Institute of Information Science, Academia Sinica, Taipei 115, Taiwan and Department of Computer Science, National Chiao Tung University, Hsinchu 30010, Taiwan;Department of Electronic Engineering & Graduate Institute of Computer and Communication Engineering, National Taipei University of Technology, Taipei 10608, Taiwan;Institute of Information Science, Academia Sinica, Taipei 115, Taiwan;Department of Computer Science, National Chiao Tung University, Hsinchu 30010, Taiwan

  • Venue:
  • Pattern Recognition
  • Year:
  • 2009

Abstract

Speaker verification is usually formulated as a statistical hypothesis testing problem and solved by a log-likelihood ratio (LLR) test. A speaker verification system's performance is highly dependent on modeling the target speaker's voice (the null hypothesis) and characterizing non-target speakers' voices (the alternative hypothesis). However, since the alternative hypothesis involves unknown impostors, it is usually difficult to characterize a priori. In this paper, we propose a framework to better characterize the alternative hypothesis with the goal of optimally distinguishing the target speaker from impostors. The proposed framework is built on a weighted arithmetic combination (WAC) or a weighted geometric combination (WGC) of useful information extracted from a set of pre-trained background models. The parameters associated with WAC or WGC are then optimized using two discriminative training methods, namely, the minimum verification error (MVE) training method and the proposed evolutionary MVE (EMVE) training method, such that both the false acceptance probability and the false rejection probability are minimized. Our experimental results show that the proposed framework outperforms conventional LLR-based approaches.
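
The LLR test with a combined alternative hypothesis can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the "useful information" taken from each background model is the log-likelihood of the test utterance under that model, and that the combination weights are non-negative and sum to one. The function names (log_wac, log_wgc, llr, accept) and the threshold value are hypothetical. The weights are simply supplied by hand here; the MVE and EMVE training that would optimize them is not shown.

```python
import math


def log_wac(log_likes, weights):
    """Log of the weighted arithmetic combination sum_i w_i * p_i.

    Computed with the log-sum-exp trick so that very small likelihoods
    do not underflow. Assumes at least one weight is strictly positive.
    """
    terms = [math.log(w) + ll for w, ll in zip(weights, log_likes) if w > 0]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


def log_wgc(log_likes, weights):
    """Log of the weighted geometric combination prod_i p_i ** w_i."""
    return sum(w * ll for w, ll in zip(weights, log_likes))


def llr(target_ll, background_lls, weights, combine=log_wac):
    """Log-likelihood ratio: target model versus combined alternative."""
    return target_ll - combine(background_lls, weights)


def accept(score, threshold=0.0):
    """Accept the claimed identity when the LLR reaches the threshold."""
    return score >= threshold


if __name__ == "__main__":
    # Hypothetical log-likelihoods of one utterance (illustrative values only).
    target_ll = -3.0
    background_lls = [-4.0, -6.0]
    weights = [0.5, 0.5]
    for name, combine in (("WAC", log_wac), ("WGC", log_wgc)):
        score = llr(target_ll, background_lls, weights, combine)
        print(name, score, accept(score))
```

With weights that sum to one, the arithmetic combination is never smaller than the geometric one, so under these assumptions the WAC-based LLR is never larger than the WGC-based LLR for the same inputs.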