Spoken Language Recognition Using Ensemble Classifiers

Authors:
Bin Ma; Haizhou Li; Rong Tong
Affiliations:
Inst. for Infocomm Res., Singapore; -; -
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2007

Abstract

In this paper, we study a novel approach to spoken language recognition using an ensemble of binary classifiers. In this framework, we first represent a speech utterance with a high-dimensional feature vector, such as its phonotactic characteristics or the polynomial expansion of its cepstral features, on which a binary classifier can be built. We adopt a distributed output coding strategy in ensemble classifier design: we decompose a multiclass language recognition problem into many binary classification tasks, each of which addresses a language recognition subtask with a component classifier. We then combine the results of the component classifiers to form an output code as a hypothesized solution to the overall language recognition problem. In this way, we effectively project high-dimensional feature vectors into a tractable low-dimensional space while maintaining the language-discriminative characteristics of the spoken utterances. By fusing the output codes from both phonotactic features and cepstral features, we achieve equal error rates of 1.38% and 3.20% for 30-s trials on the 2003 and 2005 NIST language recognition evaluation databases, respectively.
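
The output coding idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four language names, the 7-bit exhaustive code matrix, the Hamming-distance decoding rule, and the helper names `binary_targets`, `hamming`, and `decode` are all assumptions made for illustration. Each column of the matrix defines one binary subtask, and the bits produced by the component classifiers are matched against each language's codeword.

```python
# Hypothetical 4-language task. The code matrix below is an illustrative
# exhaustive code (assumption), not the one used in the paper.
# Each row is a language codeword; each column defines one binary subtask.
CODE_MATRIX = {
    "english":  (1, 1, 1, 1, 1, 1, 1),
    "mandarin": (0, 0, 0, 0, 1, 1, 1),
    "spanish":  (0, 0, 1, 1, 0, 0, 1),
    "tamil":    (0, 1, 0, 1, 0, 1, 0),
}


def binary_targets(labels, j):
    """Relabel multiclass language labels as 0/1 targets for component classifier j."""
    return [CODE_MATRIX[label][j] for label in labels]


def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(output_code):
    """Return the language whose codeword is closest to the observed output code."""
    return min(CODE_MATRIX, key=lambda lang: hamming(CODE_MATRIX[lang], output_code))


# Training side: targets seen by component classifier 1 for three utterances.
labels = ["english", "tamil", "mandarin"]
targets = binary_targets(labels, 1)

# Recognition side: the component classifiers emit one bit each; here the
# "mandarin" codeword arrives with one bit flipped by a classifier error.
noisy_code = (0, 0, 0, 1, 1, 1, 1)
hypothesis = decode(noisy_code)
```

In this toy matrix every pair of codewords differs in four positions, so a single component classifier error still decodes to the intended language. A real system would typically use classifier scores rather than hard bits, and how the phonotactic and cepstral output codes are fused is not specified in the abstract.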