Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition
On the syllabification of phonemes
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
In this paper, a rule-based automatic syllabification algorithm for the Amharic language is designed using linguistic notions, following the Maximal Onset and Sonority Hierarchy principles. Amharic uses a syllabic script in which every grapheme represents a consonant-vowel (CV) combination. However, when Amharic text is read aloud, not all CV syllables are uttered as written, so the syllables of the spoken text do not match the CV sequence seen in the grapheme sequence. Epenthesis and gemination are also major challenges in Amharic grapheme-to-phoneme conversion, because Amharic orthography does not mark epenthetic vowels or geminated consonants. This limits the performance of many Amharic speech systems (such as Text-To-Speech and Automatic Speech Recognition) and other natural language applications. After a thorough study of the syllable structure, identification of linguistic syllabification rules, and a survey of the relevant literature, a set of rules was identified and used to design a syllabification algorithm. The system was implemented and tested in an experiment using carefully selected Amharic words. It exhibited a 98.1% word accuracy rate with very high sensitivity to epenthesis.
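The abstract does not list the rules themselves. As a rough illustration of how the Maximal Onset and Sonority Hierarchy principles can interact, the following Python sketch splits a phoneme string into syllables. It is not the authors' algorithm: the sonority scale, the `syllabify` and `is_legal_onset` names, the `max_onset` parameter, and the Latin-letter test strings are all assumptions made for this example, and it handles neither epenthesis nor the Amharic script.

```python
# Toy sonority scale (an assumption for illustration): higher = more sonorous.
SONORITY = {}
for phones, rank in (("ptkbdg", 1), ("fszh", 2), ("mn", 3),
                     ("lr", 4), ("wy", 5), ("aeiou", 6)):
    for p in phones:
        SONORITY[p] = rank

VOWELS = set("aeiou")


def is_legal_onset(cluster):
    """Sonority Hierarchy check: sonority must rise strictly toward the nucleus."""
    ranks = [SONORITY[c] for c in cluster]
    return all(a < b for a, b in zip(ranks, ranks[1:]))


def syllabify(phonemes, max_onset=2):
    """Split a sequence of single-character phonemes into syllables.

    Every vowel is treated as a nucleus. For each consonant cluster between
    two nuclei, the longest legal suffix (up to max_onset consonants) becomes
    the onset of the following syllable (Maximal Onset); the rest stays in
    the coda of the preceding syllable.
    """
    nuclei = [i for i, p in enumerate(phonemes) if p in VOWELS]
    if not nuclei:
        return ["".join(phonemes)]

    boundaries = [0]
    for prev, nxt in zip(nuclei, nuclei[1:]):
        cluster = phonemes[prev + 1:nxt]
        onset_len = min(len(cluster), max_onset)
        # Shrink the candidate onset until it satisfies the sonority check.
        while onset_len > 0 and not is_legal_onset(cluster[len(cluster) - onset_len:]):
            onset_len -= 1
        boundaries.append(nxt - onset_len)
    boundaries.append(len(phonemes))

    return ["".join(phonemes[a:b]) for a, b in zip(boundaries, boundaries[1:])]


if __name__ == "__main__":
    # Hypothetical phoneme strings, not real Amharic words.
    for word in ("patra", "parta", "alla"):
        print(word, syllabify(word))
```

With this toy scale, a rising cluster such as `tr` is kept together as an onset, a falling cluster such as `rt` is split, and a geminate such as `ll` is divided across the syllable boundary. Setting `max_onset=1` restricts every onset to a single consonant, which is one way to model a language that disallows complex onsets. A phoneme missing from the scale raises a `KeyError`.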