This paper describes a small experiment testing a rule-based approach to unknown word recognition in Arabic. The morphological complexity of Arabic poses challenges for a variety of NLP applications, but it can also be viewed as an advantage if we can tap into the linguistic knowledge encoded in these complex forms. In particular, the derived forms of verbs can be analysed, and an educated guess at the likely meaning of an unknown derived form can be made from the meaning of a known form and the relationship between the known form and the unknown one. The performance of the approach is tested on the NEMLAR Written Arabic Corpus.