A Vector Space Modeling Approach to Spoken Language Identification

Authors:
Haizhou Li; Bin Ma; Chin-Hui Lee
Affiliations:
Inst. for Infocomm Res., Singapore;-;-
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2007

Citing 0
Cited 15

Using SVM as back-end classifier for language identification
EURASIP Journal on Audio, Speech, and Music Processing - Intelligent Audio, Speech, and Music Processing Applications

Automatic Language Identification with Discriminative Language Characterization Based on SVM
IEICE - Transactions on Information and Systems

A target-oriented phonotactic front-end for spoken language recognition
IEEE Transactions on Audio, Speech, and Language Processing

Transliteration alignment
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1

Arabic script language identifications using adaptive neural network
ACST '08 Proceedings of the Fourth IASTED International Conference on Advances in Computer Science and Technology

Arabic script web page language identifications using decision tree neural networks
Pattern Recognition

Integration of complementary phone recognizers for phonotactic language recognition
ICICA'10 Proceedings of the First international conference on Information computing and applications

Improved N-grams approach for web page language identification
Transactions on computational collective intelligence V

Non-English response detection method for automated proficiency scoring system
IUNLPBEA '11 Proceedings of the 6th Workshop on Innovative Use of NLP for Building Educational Applications

Beat space segmentation and octave scale cepstral feature for sung language recognition in pop music
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)

Language recognition with language total variability
Proceedings of the 2011 International Conference on Innovative Computing and Cloud Computing

Maximum A Posteriori Linear Regression for language recognition
Expert Systems with Applications: An International Journal

A hierarchical language identification system for Indian languages
Digital Signal Processing

Universal attribute characterization of spoken languages for automatic spoken language recognition
Computer Speech and Language

Unsupervised training of an HMM-based self-organizing unit recognizer with applications to topic classification and keyword discovery
Computer Speech and Language

Quantified Score

Hi-index	0.00

Abstract

We propose a novel approach to automatic spoken language identification (LID) based on vector space modeling (VSM). We assume that the overall sound characteristics of all spoken languages can be covered by a universal collection of acoustic units, which can be characterized by acoustic segment models (ASMs). A spoken utterance is then decoded into a sequence of ASM units. The ASM framework extends the idea of language-independent phone models for LID by introducing an unsupervised learning procedure that circumvents the need for phonetic transcription. Analogous to representing a text document as a term vector, we convert a spoken utterance into a feature vector whose attributes represent the co-occurrence statistics of the acoustic units. This lets us build a vector space classifier for LID. The proposed VSM approach leads to a discriminative classifier backend, which is shown to outperform a likelihood-based n-gram language modeling (LM) backend on long utterances. We evaluated the proposed VSM framework on the 1996 and 2003 NIST Language Recognition Evaluation (LRE) databases, achieving equal error rates (EERs) of 2.75% and 4.02% on the 1996 and 2003 LRE 30-s tasks, respectively, which are among the best results reported on these popular tasks.
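
The text-document analogy in the abstract can be illustrated with a minimal sketch. It assumes that each utterance has already been decoded into a sequence of acoustic-unit labels; the labels (`u1` to `u4`), the language names, and all function names are hypothetical. The abstract does not specify the classifier here, so a nearest-centroid rule with cosine similarity stands in for the discriminative backend purely for illustration, and unigram plus bigram relative frequencies stand in for the co-occurrence statistics.

```python
from collections import Counter
from math import sqrt


def ngram_vector(units, max_n=2):
    """Map a decoded unit sequence to relative frequencies of its n-grams (n = 1..max_n)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(units) - n + 1):
            counts[tuple(units[i:i + n])] += 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Average a list of sparse vectors."""
    acc = {}
    for vec in vectors:
        for k, w in vec.items():
            acc[k] = acc.get(k, 0.0) + w
    return {k: w / len(vectors) for k, w in acc.items()}


def train(labelled_utterances):
    """Build one centroid per language from (language, unit_sequence) pairs."""
    by_lang = {}
    for lang, units in labelled_utterances:
        by_lang.setdefault(lang, []).append(ngram_vector(units))
    return {lang: centroid(vecs) for lang, vecs in by_lang.items()}


def classify(model, units):
    """Return the language whose centroid is most similar to the utterance vector."""
    vec = ngram_vector(units)
    return max(model, key=lambda lang: cosine(model[lang], vec))


# Toy data: made-up unit sequences for two made-up languages (assumption).
train_data = [
    ("lang_a", ["u1", "u2", "u1", "u2", "u3"]),
    ("lang_a", ["u2", "u1", "u2", "u1"]),
    ("lang_b", ["u3", "u4", "u4", "u3", "u4"]),
    ("lang_b", ["u4", "u3", "u4", "u4"]),
]
model = train(train_data)
print(classify(model, ["u1", "u2", "u1", "u2"]))
print(classify(model, ["u4", "u3", "u4"]))
```

In a realistic setting the sparse vectors would be passed to a discriminative classifier trained on many utterances per language; the centroid rule is used here only to keep the sketch free of external dependencies.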