A Vector Space Modeling Approach to Spoken Language Identification
IEEE Transactions on Audio, Speech, and Language Processing
On Acoustic Diversification Front-End for Spoken Language Identification
IEEE Transactions on Audio, Speech, and Language Processing
This paper investigates building and fusing multiple phone recognizers in a phonotactic system for language recognition. The phone recognizers are built using both phonetic and acoustic diversification. Phonetic diversification is achieved by training multiple phone recognizers on speech corpora of different languages. Acoustic diversification is implemented in several ways, including the use of different acoustic features, different phone modeling techniques, and different training paradigms. Because some phone recognizers are highly correlated with each other, we propose a performance optimization (PO) criterion to select a set of complementary phone recognizers for fusion. Experimental results on the NIST 2007 Language Recognition Evaluation (LRE) 30-s test set show the effectiveness of the proposed approach.
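The abstract does not define the PO criterion, so the following is only a minimal sketch of the general idea of picking complementary recognizers for fusion, not the authors' method. It assumes a greedy forward selection that keeps adding the recognizer whose equal-weight score fusion most reduces the identification error on a development set, and stops when no candidate helps. The names `fuse`, `error_rate`, `greedy_select`, and the toy scores are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence

# scores[utterance][language]: one language score vector per utterance.
Scores = List[List[float]]


def fuse(systems: Sequence[Scores]) -> Scores:
    """Equal-weight fusion: average the per-language scores of the systems."""
    n_utts = len(systems[0])
    n_langs = len(systems[0][0])
    return [
        [sum(s[u][l] for s in systems) / len(systems) for l in range(n_langs)]
        for u in range(n_utts)
    ]


def error_rate(scores: Scores, labels: Sequence[int]) -> float:
    """Fraction of utterances whose top-scoring language is not the label."""
    wrong = sum(1 for row, y in zip(scores, labels) if row.index(max(row)) != y)
    return wrong / len(labels)


def greedy_select(
    dev_scores: Dict[str, Scores],
    labels: Sequence[int],
    max_systems: Optional[int] = None,
) -> List[str]:
    """Assumed selection rule (not from the paper): add the recognizer that
    lowers the fused development-set error the most; stop when none does."""
    selected: List[str] = []
    remaining = sorted(dev_scores)
    limit = max_systems if max_systems is not None else len(remaining)
    best_err = float("inf")
    while remaining and len(selected) < limit:
        cand_err, cand = min(
            (error_rate(fuse([dev_scores[n] for n in selected + [name]]), labels), name)
            for name in remaining
        )
        if cand_err >= best_err:  # no candidate improves the fused system
            break
        selected.append(cand)
        remaining.remove(cand)
        best_err = cand_err
    return selected


# Toy development set: 4 utterances, 2 languages (made-up numbers).
toy_labels = [0, 0, 1, 1]
toy_scores = {
    "A": [[0.9, 0.1], [0.6, 0.4], [0.4, 0.6], [0.6, 0.4]],    # wrong on utterance 3
    "B": [[0.9, 0.1], [0.6, 0.4], [0.4, 0.6], [0.6, 0.4]],    # duplicate of A
    "C": [[0.6, 0.4], [0.45, 0.55], [0.3, 0.7], [0.1, 0.9]],  # wrong on utterance 1
}
chosen = greedy_select(toy_scores, toy_labels)
```

In this toy case each recognizer alone misclassifies one of four utterances. Fusing A with its duplicate B changes nothing, while fusing A with C corrects both errors, so the sketch keeps A and C and discards B. A real system would likely use a calibrated fusion back-end and a detection-cost metric rather than plain averaging and error rate.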