Framework for Choosing a Set of Syllables and Phonemes for Lithuanian Speech Recognition

  • Authors:
  • Sigita Laurinčiukaitė; Antanas Lipeika

  • Affiliations:
  • Recognition Processes Department, Institute of Mathematics and Informatics, Goštauto 12, LT-01108 Vilnius, Lithuania, e-mail: sigita.lau@mch.mii.lt, lipeika@ktl.mii.lt

  • Venue:
  • Informatica
  • Year:
  • 2007

Abstract

This paper describes a framework for constructing a set of syllables and phonemes that is subsequently used to create acoustic models for continuous speech recognition of Lithuanian. The goal is to find the set of syllables and phonemes that is most important for speech recognition. The framework includes operations on the lexicon and on the transcriptions of recordings. To facilitate this work, additional programs were developed that perform word syllabification, lexicon adjustment, etc. A series of experiments was carried out to establish the framework and to model syllable- and phoneme-based speech recognition. The dominance of syllables in the lexicon improved speech recognition results and encouraged us to move away from a strict definition of the syllable, i.e., a syllable becomes a simple sub-word unit derived from a syllable. Two sets of syllables and phonemes and two types of lexicons were developed and tested. The best recognition accuracy achieved was 56.67% ± 0.33. The speech recognition system is based on Hidden Markov Models (HMM). The continuous speech corpus LRN0 was used for the speech recognition experiments.
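
As a rough illustration of what a mixed syllable-and-phoneme lexicon can look like, the sketch below keeps frequent syllables as whole units and splits the rest into smaller pieces. It is not the authors' procedure: the function name `build_unit_lexicon`, the `min_count` threshold, the hand-syllabified toy words, and the use of single characters as a stand-in for phonemes are all assumptions made for this example.

```python
from collections import Counter


def build_unit_lexicon(syllabified, min_count):
    """Map each word to sub-word units.

    Syllables seen at least `min_count` times across the lexicon are kept
    whole; rarer syllables are split into single characters, which serve
    here only as a stand-in for phonemes (an assumption of this sketch).
    """
    # Count how often each syllable occurs over all lexicon entries.
    counts = Counter(s for sylls in syllabified.values() for s in sylls)
    kept = {s for s, c in counts.items() if c >= min_count}

    lexicon = {}
    for word, sylls in syllabified.items():
        units = []
        for s in sylls:
            if s in kept:
                units.append(s)   # frequent syllable stays a single unit
            else:
                units.extend(s)   # rare syllable falls back to characters
        lexicon[word] = units
    return kept, lexicon


# Toy, hand-syllabified input (hypothetical data, not from the LRN0 corpus).
toy = {
    "mama": ["ma", "ma"],
    "mano": ["ma", "no"],
    "namo": ["na", "mo"],
}
kept, lexicon = build_unit_lexicon(toy, min_count=2)
print(kept)
print(lexicon)
```

Raising `min_count` moves the lexicon toward a purely phoneme-based one, while lowering it moves toward a purely syllable-based one, which is one simple way to compare unit sets of different sizes.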