Sinhala grapheme-to-phoneme conversion and rules for schwa epenthesis

  • Authors: Asanka Wasala; Ruvan Weerasinghe; Kumudu Gamage
  • Affiliations: University of Colombo, Colombo, Sri Lanka (all three authors)
  • Venue: COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
  • Year: 2006

Abstract

This paper describes an architecture for converting Sinhala Unicode text into a phonemic specification of pronunciation. The study focuses mainly on disambiguating /schwa/ and /a/ vowel epenthesis for consonants, one of the significant problems found in Sinhala. The problem is addressed by formulating a set of rules. The proposed rule set was tested on 30,000 distinct words obtained from a corpus, and its output was compared with phonemic transcriptions of the same words produced manually by an expert. The resulting Grapheme-to-Phoneme (G2P) conversion model achieves 98% accuracy.
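
The pipeline the abstract outlines (rule-based conversion, then word-level comparison against expert transcriptions) can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the names toy_g2p and word_accuracy, the five-letter consonant table, the "@" symbol standing in for schwa, and above all the placeholder rule "inherent vowel is /a/ in the first syllable and schwa elsewhere", which is not one of the paper's rules. The gold entries are invented placeholders, not verified transcriptions.

```python
# Toy inventory: a few Sinhala consonant letters mapped to phoneme symbols.
CONSONANTS = {
    "\u0D9A": "k",  # SINHALA LETTER ALPAPRAANA KAYANNA
    "\u0DB8": "m",  # SINHALA LETTER MAYANNA
    "\u0DBD": "l",  # SINHALA LETTER DANTAJA LAYANNA
    "\u0DB1": "n",  # SINHALA LETTER DANTAJA NAYANNA
    "\u0DBB": "r",  # SINHALA LETTER RAYANNA
}
# Dependent vowel signs that replace the inherent vowel of a consonant.
VOWEL_SIGNS = {"\u0DCF": "a:", "\u0DD2": "i", "\u0DD4": "u"}
HAL = "\u0DCA"  # SINHALA SIGN AL-LAKUNA: suppresses the inherent vowel


def toy_g2p(word):
    """Convert a word made only of the toy inventory into a phoneme list.

    ASSUMPTION: the /a/-vs-schwa choice below is a placeholder rule,
    not the rule set formulated in the paper.
    """
    phonemes = []
    syllable_index = 0
    i = 0
    while i < len(word):
        ch = word[i]
        if ch not in CONSONANTS:
            raise ValueError(f"unsupported character: {ch!r}")
        phonemes.append(CONSONANTS[ch])
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if nxt == HAL:
            i += 2  # bare consonant, no vowel follows
        elif nxt in VOWEL_SIGNS:
            phonemes.append(VOWEL_SIGNS[nxt])
            syllable_index += 1
            i += 2
        else:
            # Inherent vowel: placeholder disambiguation between /a/ and schwa.
            phonemes.append("a" if syllable_index == 0 else "@")
            syllable_index += 1
            i += 1
    return phonemes


def word_accuracy(g2p, gold):
    """Fraction of words whose predicted phoneme list equals the gold one."""
    if not gold:
        raise ValueError("gold set is empty")
    correct = sum(1 for word, ref in gold.items() if g2p(word) == ref)
    return correct / len(gold)


# Hypothetical gold set (placeholder values, not expert transcriptions).
GOLD = {
    "\u0D9A\u0DB8\u0DBD": ["k", "a", "m", "@", "l", "@"],
    "\u0DB8\u0DBD": ["m", "a", "l", "a"],
}

if __name__ == "__main__":
    print(word_accuracy(toy_g2p, GOLD))
```

Passing the converter as an argument keeps the scoring independent of any particular rule set, so a fuller set of rules could be swapped in and scored against the same gold data.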