In this paper, we study a novel approach to spoken language recognition using an ensemble of binary classifiers. In this framework, we first represent a speech utterance as a high-dimensional feature vector, such as its phonotactic characteristics or a polynomial expansion of its cepstral features. A binary classifier can be built on such feature vectors. We adopt a distributed output coding strategy in the ensemble classifier design: we decompose the multiclass language recognition problem into many binary classification tasks, each of which addresses a language recognition subtask with a component classifier. We then combine the results of the component classifiers into an output code that serves as a hypothesized solution to the overall language recognition problem. In this way, we effectively project the high-dimensional feature vectors into a tractable low-dimensional space while preserving the language-discriminative characteristics of the spoken utterances. By fusing the output codes from both phonotactic and cepstral features, we achieve equal error rates of 1.38% and 3.20% for 30-s trials on the 2003 and 2005 NIST language recognition evaluation databases, respectively.
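The output coding idea can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the four language labels, the 6-bit code matrix, the 4-dimensional toy feature vectors, the nearest-centroid linear component classifier, and Hamming-distance decoding. It shows only the general pattern, in which each column of a code matrix defines one binary subtask, the component decisions form an output code, and the code is mapped back to a language.

```python
# Illustrative sketch only: labels, code matrix, toy features and the
# centroid-based component classifier are assumptions, not the paper's setup.

LANGUAGES = ["en", "zh", "es", "ar"]  # hypothetical language labels

# Code matrix: one row per language, one column per binary subtask.
# Any two rows differ in 4 positions, so one wrong bit can be corrected.
CODE_MATRIX = {
    "en": (1, 1, 1, 0, 0, 0),
    "zh": (1, 0, 0, 1, 1, 0),
    "es": (0, 1, 0, 1, 0, 1),
    "ar": (0, 0, 1, 0, 1, 1),
}


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def train_component(features, bits):
    """Fit a linear rule that separates the centroids of the two bit classes."""
    pos = mean([x for x, b in zip(features, bits) if b == 1])
    neg = mean([x for x, b in zip(features, bits) if b == 0])
    w = [p - q for p, q in zip(pos, neg)]
    # Place the decision boundary at the midpoint between the two centroids.
    bias = -sum(wi * (p + q) / 2 for wi, p, q in zip(w, pos, neg))
    return w, bias


def predict_bit(model, x):
    """Binary decision of one component classifier."""
    w, bias = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + bias > 0 else 0


def train_ensemble(features, labels):
    """Train one component classifier per column of the code matrix."""
    n_tasks = len(CODE_MATRIX[LANGUAGES[0]])
    return [
        train_component(features, [CODE_MATRIX[lab][j] for lab in labels])
        for j in range(n_tasks)
    ]


def output_code(ensemble, x):
    """Project a feature vector onto the low-dimensional output code."""
    return tuple(predict_bit(model, x) for model in ensemble)


def decode(code):
    """Return the language whose codeword is closest in Hamming distance."""
    return min(
        LANGUAGES,
        key=lambda lab: sum(a != b for a, b in zip(code, CODE_MATRIX[lab])),
    )


# Toy training data: two 4-dimensional "utterance vectors" per language.
TRAIN = {
    "en": [[1.0, 0.1, 0.0, 0.0], [0.9, 0.0, 0.1, 0.0]],
    "zh": [[0.0, 1.0, 0.1, 0.0], [0.1, 0.9, 0.0, 0.0]],
    "es": [[0.0, 0.0, 1.0, 0.1], [0.0, 0.1, 0.9, 0.0]],
    "ar": [[0.1, 0.0, 0.0, 1.0], [0.0, 0.0, 0.1, 0.9]],
}
features = [x for lab in LANGUAGES for x in TRAIN[lab]]
labels = [lab for lab in LANGUAGES for _ in TRAIN[lab]]
ensemble = train_ensemble(features, labels)

test_vector = [0.1, 0.8, 0.0, 0.1]  # hypothetical unseen utterance
code = output_code(ensemble, test_vector)
print(code, decode(code))
```

In this sketch the 4-dimensional input is mapped to a 6-bit code only for readability; in the setting described above the input vectors are high-dimensional and the output code is comparatively short. The redundancy in the codewords means a single incorrect component decision still decodes to the intended language.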