Modeling improved syllabification algorithm for Amharic

  • Authors:
  • Nirayo Hailu; Sebsibe Hailemariam

  • Affiliations:
  • Hawassa University, Hawassa, Ethiopia; Addis Ababa University, Addis Ababa, Ethiopia

  • Venue:
  • Proceedings of the International Conference on Management of Emergent Digital EcoSystems
  • Year:
  • 2012

Abstract

This paper presents a rule-based automatic syllabification algorithm for Amharic, designed using linguistic notions and following the Maximal Onset and Sonority Hierarchy principles. Amharic is a syllabic language in which every grapheme represents a consonant-vowel (CV) assimilation. However, when Amharic text is read aloud, not all CV syllables are uttered as written, so the spoken syllables do not match the CV sequence seen in the grapheme sequence. Epenthesis and gemination are also major challenges in Amharic grapheme-to-phoneme conversion, because Amharic orthography does not mark epenthetic vowels or geminated consonants. This limits the performance of many Amharic speech systems (such as Text-To-Speech and Automatic Speech Recognition) and other natural language applications. After a thorough study of Amharic syllable structure and a survey of the relevant literature, a set of linguistic syllabification rules was identified and used to design the syllabification algorithm. The system was implemented and tested on carefully selected Amharic words. It achieved a 98.1% word accuracy rate with very high sensitivity to epenthesis.
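
As a rough illustration of the two principles named in the abstract, the sketch below applies Maximal Onset with a sonority check to toy romanized phoneme strings. It is not the authors' algorithm: the vowel set, the sonority ranks, the function names, and the default `max_onset_len=1` are all assumptions made for this example, and it does not attempt epenthesis or gemination handling.

```python
# Minimal sketch of Maximal Onset + Sonority Hierarchy syllabification.
# All tables and names below are illustrative assumptions, not from the paper.

VOWELS = set("aeiou")  # assumed toy vowel inventory

# Assumed sonority ranks (higher = more sonorous).
SONORITY = {}
for rank, group in enumerate(["ptkbdg", "fszh", "mn", "lr", "wy"], start=1):
    for consonant in group:
        SONORITY[consonant] = rank


def is_legal_onset(cluster, max_len):
    """An onset is accepted if it is short enough and sonority rises
    strictly toward the nucleus (Sonority Hierarchy)."""
    if len(cluster) > max_len:
        return False
    ranks = [SONORITY.get(c, 1) for c in cluster]
    return all(a < b for a, b in zip(ranks, ranks[1:]))


def syllabify(phonemes, max_onset_len=1):
    """Split a phoneme string into syllables, giving each non-initial
    nucleus the longest legal onset (Maximal Onset)."""
    nuclei = [i for i, p in enumerate(phonemes) if p in VOWELS]
    if not nuclei:
        return [phonemes]

    boundaries = []
    for prev, nxt in zip(nuclei, nuclei[1:]):
        cluster = phonemes[prev + 1:nxt]
        split = nxt
        # Try the longest consonant suffix first, then shorter ones.
        for k in range(len(cluster), -1, -1):
            if is_legal_onset(cluster[len(cluster) - k:], max_onset_len):
                split = nxt - k
                break
        boundaries.append(split)

    syllables, start = [], 0
    for b in boundaries:
        syllables.append(phonemes[start:b])
        start = b
    syllables.append(phonemes[start:])
    return syllables


print(syllabify("selam"))
print(syllabify("mengist"))
print(syllabify("abra", max_onset_len=2))
```

Raising `max_onset_len` lets the sonority check decide between competing splits, while a value of 1 forces every intervocalic cluster to leave at most one consonant in the following onset. A real system along the lines described in the abstract would also need rules for inserting epenthetic vowels and detecting gemination before this step.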