Lithuanian Speech Recognition Using the English Recognizer

Authors: Pijus Kasparaitis
Affiliations: Department of Computer Science II, Faculty of Mathematics and Informatics, Vilnius University, Naugarduko 24, 03225 Vilnius, Lithuania, e-mail: pkasparaitis@yahoo.com
Venue: Informatica
Year: 2008

Abstract

The present work is concerned with speech recognition using a small or medium-sized vocabulary. The possibility of using an English speech recognizer to recognize Lithuanian was investigated. Two methods were used to address this problem: the expert-driven (knowledge-based) method and the data-driven one. The phonological systems of English and Lithuanian were compared on the basis of phonological knowledge, and correspondences between certain Lithuanian and English phonemes were established. Situations in which the correspondences had to be established experimentally (i.e., using the data-driven method) were identified, along with the English phonemes that best matched the Lithuanian sounds or their combinations (e.g., diphthongs) in those situations. The results were used to create transcriptions of the Lithuanian names and surnames used in the recognition experiments. Experiments were carried out without transcriptions, with a single transcription, and with many transcriptions. A method for finding a small number of best transcriptions was proposed. A recognition rate of 84.2% was achieved with a vocabulary containing 500 word pairs.
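
As a rough illustration of the idea of creating one or many transcriptions from phoneme correspondences, the following minimal Python sketch expands a sequence of Lithuanian sound units into every English-phoneme transcription allowed by a candidate table, and then keeps a few of them. Everything in it is an assumption made for illustration: the `CANDIDATES` table, the ARPAbet-like symbols, the use of letters as Lithuanian units, and the caller-supplied `score` function are hypothetical and are not the correspondences or the selection method reported in the paper.

```python
from itertools import product

# Hypothetical table: each Lithuanian sound unit maps to one or more candidate
# English phoneme sequences. The symbols are ARPAbet-like and illustrative only.
CANDIDATES = {
    "a": [("AA",), ("AH",)],
    "s": [("S",)],
    "t": [("T",)],
    "ai": [("AY",), ("AA", "IY")],  # a diphthong with two possible renderings
}


def transcriptions(units, candidates=CANDIDATES):
    """Return every English-phoneme transcription of a sequence of units."""
    options = [candidates[u] for u in units]
    # Take one candidate per unit in every possible combination, then flatten.
    return [
        tuple(phone for part in combo for phone in part)
        for combo in product(*options)
    ]


def best_n(variants, score, n=2):
    """Keep the n highest-scoring variants; `score` is supplied by the caller."""
    return sorted(variants, key=score, reverse=True)[:n]


if __name__ == "__main__":
    for variant in transcriptions(["t", "a", "s"]):
        print(" ".join(variant))
```

In practice the score would presumably come from recognition results on recorded speech; here it is left as a parameter so that the sketch makes no claim about how the best transcriptions were actually chosen.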