An HMM approach to vowel restoration in Arabic and Hebrew
SEMITIC '02 Proceedings of the ACL-02 workshop on Computational approaches to semitic languages
Maximum entropy based restoration of Arabic diacritics
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Multi-agent Based Arabic Speech Recognition
WI-IATW '07 Proceedings of the 2007 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshops
Syllable-based automatic Arabic speech recognition
ISPRA'08 Proceedings of the 7th WSEAS International Conference on Signal Processing, Robotics and Automation
Syllable-based automatic arabic speech recognition in noisy-telephone channel
WSEAS Transactions on Signal Processing
Arabic diacritic restoration approach based on maximum entropy models
Computer Speech and Language
Arabic diacritization through full morphological tagging
NAACL-Short '07 Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Arabic diacritization using weighted finite-state transducers
Semitic '05 Proceedings of the ACL Workshop on Computational Approaches to Semitic Languages
Automatic diacritization for low-resource languages using a hybrid word and consonant CMM
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Decision trees for lexical smoothing in statistical machine translation
WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Phonetically rich and balanced text and speech corpora for Arabic language
Language Resources and Evaluation
Hi-index | 0.00 |
Automatic recognition of Arabic dialectal speech is a challenging task because Arabic dialects are essentially spoken varieties. Only few dialectal resources are available to date; moreover, most available acoustic data collections are transcribed without diacritics. Such a transcription omits essential pronunciation information about a word, such as short vowels. In this paper we investigate various procedures that enable us to use such training data by automatically inserting the missing diacritics into the transcription. These procedures use acoustic information in combination with different levels of morphological and contextual constraints. We evaluate their performance against manually diacritized transcriptions. In addition, we demonstrate the effect of their accuracy on the recognition performance of acoustic models trained on automatically diacritized training data.