Speaker modeling technique based on regression class for speaker identification with sparse training

  • Authors:
  • Zhonghua Fu;Rongchun Zhao

  • Affiliations:
  • School of Computer Science, Northwestern Polytechnical University, Xi'an, P.R. China;School of Computer Science, Northwestern Polytechnical University, Xi'an, P.R. China

  • Venue:
  • SINOBIOMETRICS'04 Proceedings of the 5th Chinese conference on Advances in Biometric Person Authentication
  • Year:
  • 2004

Abstract

Speaker modeling with sparse training data is an active branch of robust speaker recognition research. This paper presents a novel modeling approach named the Multi-EigenSpace modeling technique based on Regression Class (RC-MES), which integrates the common eigenspace technique with the regression class (RC) idea of Maximum Likelihood Linear Regression (MLLR). RC-MES not only addresses the limited prior knowledge of Gaussian Mixture Models (GMM) but also remedies a shortcoming of the common eigenspace, which confuses speaker differences with phoneme differences. Performing eigenvoice analysis within each RC provides better discrimination between speakers. Experimental results on speaker identification with 75 male speakers show that, when enrolment data is sparse, RC-MES provides a significant improvement over GMM and requires fewer eigenvoices than the common eigenspace.
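The core idea of the abstract, one eigenspace per regression class rather than a single common eigenspace, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the array layout, and the assumption that each Gaussian component has already been assigned to a regression class are all hypothetical. The sketch also swaps in a plain least-squares projection where an eigenvoice system would normally estimate the weights by maximum likelihood from the enrolment data.

```python
import numpy as np


def fit_class_eigenspaces(train_means, class_of_component, n_eigenvoices):
    """Build one eigenspace per regression class (hypothetical helper).

    train_means: (n_speakers, n_components, dim) GMM mean vectors of the
        training speakers.
    class_of_component: (n_components,) integer class label per Gaussian.
    Returns {class_id: (mean_supervector, basis)} where the rows of basis
    are the leading principal directions ("eigenvoices") of that class.
    """
    spaces = {}
    for c in np.unique(class_of_component):
        idx = np.flatnonzero(class_of_component == c)
        # Stack the means of this class into one supervector per speaker.
        X = train_means[:, idx, :].reshape(len(train_means), -1)
        mean = X.mean(axis=0)
        # Rows of vt are orthonormal principal directions of the centered data.
        _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
        spaces[int(c)] = (mean, vt[:n_eigenvoices])
    return spaces


def project_speaker(speaker_means, class_of_component, spaces):
    """Constrain a speaker's means to the per-class eigenspaces.

    Uses a least-squares projection (an assumption of this sketch, standing
    in for maximum-likelihood weight estimation).
    """
    out = np.empty_like(speaker_means)
    for c, (mean, basis) in spaces.items():
        idx = np.flatnonzero(class_of_component == c)
        x = speaker_means[idx].reshape(-1)
        weights = basis @ (x - mean)          # coordinates in the eigenspace
        out[idx] = (mean + basis.T @ weights).reshape(len(idx), -1)
    return out
```

Because each class has its own basis, variation between phoneme-like groups of Gaussians is kept out of the directions used to describe speaker variation, which is the motivation the abstract gives for the regression-class split.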