Maximum A Posteriori Linear Regression for language recognition

  • Authors:
  • Jinchao Yang;Xiang Zhang;Hongbin Suo;Li Lu;Jianping Zhang;Yonghong Yan

  • Affiliations:
  • Key Laboratory of Speech Acoustics and Content Understanding, Chinese Academy of Sciences, Beijing, China (all authors)

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2012

Quantified Score

Hi-index 12.05

Abstract

This paper proposes using Maximum A Posteriori Linear Regression (MAPLR) transforms as features for language recognition. Rather than estimating the transforms by maximum likelihood linear regression (MLLR), MAPLR incorporates prior information about the transforms into the estimation process, using maximum a posteriori (MAP) as the estimation criterion to derive the transforms. Through multiple MAPLR adaptations, each spoken utterance is converted into one discriminative transform supervector consisting of one target-language transform vector and the other non-target transform vectors. SVM classifiers are employed to model the discriminative MAPLR transform supervectors. The system achieves performance comparable to that of state-of-the-art approaches and better than that of MLLR. Experimental results on the 2007 NIST Language Recognition Evaluation (LRE) database show that replacing MLLR with MAPLR yields relative reductions of 4% in EER and 9% in minimum cost in the 30-s task, and that further improvement is gained by combining the system with state-of-the-art systems. The combination leads to gains of 6% in EER and 11% in minDCF compared with the combination of only the MMI system and the GMM-SVM system.
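
The difference between the two estimation criteria, and the supervector construction, can be illustrated with a toy sketch. This is not the paper's estimator: it assumes a single scalar bias term, a Gaussian prior, and known noise variance, and the function names (`ml_bias`, `map_bias`, `supervector`) are hypothetical. It only shows how a MAP estimate pulls the maximum likelihood estimate toward the prior, and how per-language transform vectors might be concatenated into one feature vector for a classifier.

```python
from typing import List, Sequence


def ml_bias(residuals: Sequence[float]) -> float:
    """Maximum likelihood estimate of a shared bias: the sample mean."""
    return sum(residuals) / len(residuals)


def map_bias(residuals: Sequence[float], prior_mean: float,
             prior_var: float, noise_var: float) -> float:
    """MAP estimate of the same bias under an assumed Gaussian prior
    N(prior_mean, prior_var) and Gaussian noise with variance noise_var."""
    n = len(residuals)
    # Posterior precision is the sum of data precision and prior precision.
    precision = n / noise_var + 1.0 / prior_var
    weighted_sum = sum(residuals) / noise_var + prior_mean / prior_var
    return weighted_sum / precision


def supervector(transforms: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate per-language transform vectors into one supervector.
    By convention here (an assumption), the target language comes first."""
    return [value for vec in transforms for value in vec]


if __name__ == "__main__":
    residuals = [1.0, 3.0]
    print("ML estimate: ", ml_bias(residuals))
    print("MAP estimate:", map_bias(residuals, prior_mean=0.0,
                                    prior_var=1.0, noise_var=1.0))
    print("Supervector: ", supervector([[1.0, 2.0], [3.0]]))
```

With little adaptation data the MAP estimate stays close to the prior mean, and as the number of observations grows it approaches the maximum likelihood estimate, which is the usual motivation for preferring a MAP criterion on short utterances.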