Automatic speech recognition for under-resourced languages: application to Vietnamese language

Authors:
Viet-Bac Le; Laurent Besacier
Affiliations:
LIG laboratory, Joseph Fourier University, UMR CNRS, Grenoble Cedex 9, France; LIG laboratory, Joseph Fourier University, UMR CNRS, Grenoble Cedex 9, France
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2009

Abstract

This paper presents our work on automatic speech recognition (ASR) for under-resourced languages, with application to Vietnamese. Different techniques for bootstrapping acoustic models are presented. First, we present the use of acoustic-phonetic unit distances and the potential of crosslingual acoustic modeling for under-resourced languages. Experimental results on Vietnamese showed that with only a few hours of target-language speech data, crosslingual context-independent modeling worked better than crosslingual context-dependent modeling; when more speech data were available, however, context-dependent modeling outperformed it. In both cases, we concluded that crosslingual systems are better than monolingual baseline systems. Grapheme-based acoustic modeling, which avoids building a phonetic dictionary, is also investigated. Finally, since the use of sub-word units (morphemes, syllables, characters, etc.) can reduce the high out-of-vocabulary rate and mitigate the lack of text resources in statistical language modeling for under-resourced languages, we propose several methods to decompose, normalize, and combine word and sub-word lattices generated from different ASR systems. The proposed lattice combination scheme yields a relative syllable error rate reduction of 6.6% over the sentence MAP baseline method on a Vietnamese ASR task.
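
To illustrate why grapheme-based modeling removes the need for a phonetic dictionary, the sketch below derives a pronunciation lexicon directly from the spelling of each word. It is a minimal illustration, not the authors' implementation: the function name `grapheme_lexicon`, the choice of one unit per Unicode code point after NFC normalization, and the sample words are all assumptions made here, and the paper may define its grapheme units differently (for example, in how tone marks are handled).

```python
import unicodedata


def grapheme_lexicon(words):
    """Map each word to a list of grapheme units (hypothetical helper).

    Assumption: one unit per code point after NFC normalization, so a
    precomposed accented letter such as "ệ" stays a single unit.
    """
    lexicon = {}
    for word in words:
        # NFC composes base letters and diacritics into single code points
        # wherever a precomposed form exists.
        normalized = unicodedata.normalize("NFC", word.lower())
        # Drop whitespace so multi-syllable entries yield only letter units.
        lexicon[word] = [ch for ch in normalized if not ch.isspace()]
    return lexicon


if __name__ == "__main__":
    # Sample words are illustrative only, not taken from the paper's data.
    for word, units in grapheme_lexicon(["việt", "nam"]).items():
        print(word, " ".join(units))
```

Each lexicon entry here plays the role that a phoneme sequence plays in a conventional pronunciation dictionary, so acoustic units can be trained from transcribed speech without hand-written phonetic entries.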