This paper presents a strategy to optimize the phonotactic front-end for spoken language recognition. This is achieved by selecting a subset of phones from an existing phone recognizer's inventory such that only the phones that best discriminate each target language are retained. Each such phone subset is used to construct a target-oriented phone tokenizer (TOPT). In this study, we examine different approaches to constructing such phone tokenizers for the front-end of a parallel phone recognition followed by vector space modeling (PPR-VSM) system. We show that the target-oriented phone tokenizers derived from language-specific phone recognizers are more effective than the original parallel phone recognizers. Our experimental results also show that the target-oriented phone tokenizers derived from universal phone recognizers achieve better performance than those derived from language-specific phone recognizers. Using the proposed target-oriented phone tokenizers as the phonotactic front-end, language recognition performance is significantly improved without the need for additional training samples. We achieve equal error rates (EER) of 1.27%, 1.42%, and 2.73% on the NIST 1996, 2003, and 2007 LRE databases, respectively, for the 30-s closed-set tests. This system is one of the subsystems in IIR's submission to NIST 2007 LRE.
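The core idea — keeping only the most discriminative phones per target language — can be illustrated with a minimal sketch. The selection criterion and scores below are hypothetical placeholders, not the paper's actual discrimination measure; only the top-N ranking step reflects the general approach of building a per-language phone subset from a recognizer's inventory.

```python
# Hedged sketch of target-oriented phone selection: for one target
# language, rank the phones in a recognizer's inventory by some
# per-language discrimination score and keep the top subset.
# The scores here are illustrative placeholders, not real measurements.

def select_topt_phones(phone_scores, subset_size):
    """phone_scores: dict mapping phone -> discrimination score for a
    single target language. Returns the subset_size highest-scoring
    phones, which would form that language's TOPT inventory."""
    ranked = sorted(phone_scores, key=phone_scores.get, reverse=True)
    return ranked[:subset_size]

# Hypothetical inventory scores for one target language
scores = {"aa": 0.92, "iy": 0.85, "sh": 0.40, "t": 0.10, "ng": 0.77}
print(select_topt_phones(scores, 3))  # -> ['aa', 'iy', 'ng']
```

In a full PPR-VSM front-end, one such subset would be derived per target language (from either a language-specific or a universal phone recognizer), and each subset would define its own tokenizer whose phone-n-gram statistics feed the vector space model.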