Maximum A Posteriori Linear Regression for language recognition

Authors:
Jinchao Yang; Xiang Zhang; Hongbin Suo; Li Lu; Jianping Zhang; Yonghong Yan
Affiliations:
Key Laboratory of Speech Acoustics and Content Understanding, Chinese Academy of Sciences, Beijing, China (all authors)
Venue:
Expert Systems with Applications: An International Journal
Year:
2012

Abstract

This paper proposes using Maximum A Posteriori Linear Regression (MAPLR) transforms as features for language recognition. Rather than estimating the transforms with maximum likelihood linear regression (MLLR), MAPLR incorporates prior information about the transforms into the estimation process, using maximum a posteriori (MAP) as the estimation criterion to derive the transforms. Through multiple MAPLR adaptations, each spoken utterance is converted into one discriminative transform supervector consisting of one target-language transform vector and the other non-target transform vectors. SVM classifiers are employed to model the discriminative MAPLR transform supervectors. The system achieves performance comparable to that of state-of-the-art approaches and better than that of MLLR. Experimental results on the 2007 NIST Language Recognition Evaluation (LRE) database show that replacing MLLR with MAPLR yields relative reductions of 4% in EER and 9% in mincost on the 30-s task, and that further improvement is gained by combining it with state-of-the-art systems: the combination leads to gains of 6% in EER and 11% in minDCF compared with combining only the MMI system and the GMM-SVM system.
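
The supervector construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`flatten`, `build_supervector`), the row-by-row vectorization, the ordering (target-language transform first, non-target transforms after it in alphabetical order), and the toy transform values are all assumptions made for illustration. The abstract only states that the supervector consists of one target-language transform vector and the other non-target transform vectors.

```python
from typing import Dict, List

Matrix = List[List[float]]


def flatten(transform: Matrix) -> List[float]:
    """Vectorize a transform matrix row by row (an assumed convention)."""
    return [value for row in transform for value in row]


def build_supervector(transforms: Dict[str, Matrix], target: str) -> List[float]:
    """Concatenate the target-language transform vector with the non-target ones.

    Assumption: the target transform comes first, followed by the
    non-target transforms sorted by language name.
    """
    if target not in transforms:
        raise KeyError(f"no transform available for target language {target!r}")
    supervector = flatten(transforms[target])
    for language in sorted(transforms):
        if language != target:
            supervector.extend(flatten(transforms[language]))
    return supervector


# Hypothetical 2x3 transforms [A | b] for a 2-dimensional feature space.
# The numbers are made up; real transforms would come from MAPLR adaptation.
toy_transforms = {
    "english": [[1.0, 0.0, 0.1], [0.0, 1.0, 0.2]],
    "mandarin": [[0.9, 0.1, 0.0], [0.1, 0.9, 0.3]],
    "spanish": [[1.1, 0.0, -0.1], [0.0, 1.2, 0.0]],
}

supervector = build_supervector(toy_transforms, target="mandarin")
print(len(supervector))
```

In a full system, one such supervector per utterance would be passed to the SVM classifier for the corresponding target language; the adaptation step that produces the transforms is outside the scope of this sketch.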