Generation of Phonetic Units for Mixed-Language Speech Recognition Based on Acoustic and Contextual Analysis

Authors:
Chien-Lin Huang; Chung-Hsien Wu
Venue:
IEEE Transactions on Computers
Year:
2007

Abstract

This work presents a novel approach to generating phonetic units for recognizing mixed-language or multilingual speech. Acoustic and contextual analyses are performed to characterize the multilingual phonetic units used for phone set creation. Acoustic likelihood is used to estimate the similarity between phone models, and the hyperspace analog to language (HAL) model is adopted for contextual modeling and contextual similarity estimation. A confusion matrix that combines the acoustic and contextual similarities between every pair of phonetic units is built for phonetic unit clustering, and multidimensional scaling (MDS) is applied to this matrix to reduce its dimensionality. Experimental results indicate that the resulting phonetic set is compact and robust, and that it accounts for both acoustic and contextual information in mixed-language or multilingual speech recognition.
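The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy phone sequences, the hand-made acoustic similarity matrix, the cosine measure for contextual similarity, the weighted sum controlled by a hypothetical `alpha`, and the use of classical (eigendecomposition-based) MDS are all assumptions chosen for illustration. The abstract does not specify these details.

```python
import numpy as np

def hal_matrix(sequences, units, window=3):
    """HAL-style co-occurrence matrix: a unit preceded by another unit at
    distance d (1 <= d <= window) adds weight (window - d + 1)."""
    index = {u: i for i, u in enumerate(units)}
    hal = np.zeros((len(units), len(units)))
    for seq in sequences:
        for pos, unit in enumerate(seq):
            for dist in range(1, window + 1):
                if pos - dist < 0:
                    break
                hal[index[unit], index[seq[pos - dist]]] += window - dist + 1
    return hal

def contextual_similarity(hal):
    """Cosine similarity between context vectors (assumed measure).
    Each vector concatenates a unit's row (left context) and column (right context)."""
    vectors = np.hstack([hal, hal.T])
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for unseen units
    unit_vectors = vectors / norms
    return unit_vectors @ unit_vectors.T

def combine(acoustic_sim, contextual_sim, alpha=0.5):
    """Weighted sum of the two similarity matrices (assumed combination rule)."""
    return alpha * acoustic_sim + (1.0 - alpha) * contextual_sim

def classical_mds(similarity, dims=2):
    """Classical MDS on the dissimilarity 1 - similarity."""
    dist = 1.0 - similarity
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dims]  # largest eigenvalues first
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0.0, None))

# Toy data, made up for illustration only.
units = ["a", "b", "k", "t"]
sequences = [["k", "a", "t"], ["t", "a", "k"], ["b", "a", "t"]]
acoustic = np.array([  # hypothetical symmetric acoustic similarities
    [1.0, 0.2, 0.1, 0.1],
    [0.2, 1.0, 0.3, 0.3],
    [0.1, 0.3, 1.0, 0.6],
    [0.1, 0.3, 0.6, 1.0],
])

context = contextual_similarity(hal_matrix(sequences, units, window=2))
confusion = combine(acoustic, context, alpha=0.5)
coords = classical_mds(confusion, dims=2)  # one low-dimensional point per unit
print(coords.shape)
```

Units whose points lie close together in the reduced space would be candidates for merging into a single phonetic unit; the clustering step itself is left out of this sketch.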