Morphological disambiguation is the process of assigning a single set of morphological features to each word in a text. When a word is ambiguous (it has several possible analyses), a disambiguation procedure based on the word's context must be applied. This paper deals with morphological disambiguation of Hebrew, which combines morphemes into words in both agglutinative and fusional ways. We present an unsupervised stochastic model (the only resource we use is a morphological analyzer) that deals with the data sparseness problem caused by the affixational morphology of Hebrew. We present a text encoding method for languages with affixational morphology in which knowledge of word formation rules (which are quite restricted in Hebrew) helps in the disambiguation. We adapt HMM algorithms for learning and searching this text representation, so that segmentation and tagging can be learned in parallel in a single step. Results of a large-scale evaluation indicate that this learning improves disambiguation for complex tag sets. Our method is applicable to other languages with affixational morphology.
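To make the idea of searching segmentation and tagging together more concrete, the following is a minimal sketch of Viterbi decoding over morpheme-level analyses. It is not the implementation described in the paper: it assumes a first-order HMM, it covers only the search step (the unsupervised learning step is omitted), and the function name `viterbi_lattice`, the tag names, the transliterated word forms, and all probabilities are hypothetical values chosen for illustration. Each word comes with the candidate analyses that a morphological analyzer might propose, and each analysis is a sequence of (morpheme, tag) pairs.

```python
import math

def viterbi_lattice(candidates, trans, emit, start="<s>", floor=1e-6):
    """Pick one analysis per word so that the morpheme-level path is most probable.

    candidates: one list per word; each item is an analysis, i.e. a non-empty
                list of (morpheme, tag) pairs.
    trans:      dict mapping (previous_tag, tag) to a transition probability.
    emit:       dict mapping (tag, morpheme) to an emission probability.
    Unseen events receive the small probability `floor` (an assumption of this sketch).
    """
    def logp(table, key):
        return math.log(table.get(key, floor))

    # best[last_tag] = (log probability, analyses chosen so far).
    # In a first-order model only the last tag matters for what follows.
    best = {start: (0.0, [])}
    for analyses in candidates:
        new_best = {}
        for analysis in analyses:
            for prev_tag, (score, path) in best.items():
                s, t_prev = score, prev_tag
                # Score every morpheme of this analysis in sequence.
                for morpheme, tag in analysis:
                    s += logp(trans, (t_prev, tag)) + logp(emit, (tag, morpheme))
                    t_prev = tag
                if t_prev not in new_best or s > new_best[t_prev][0]:
                    new_best[t_prev] = (s, path + [analysis])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]

# Toy data: every number below is invented for illustration only.
trans = {
    ("<s>", "PREP"): 0.4, ("<s>", "NOUN"): 0.3,
    ("PREP", "NOUN"): 0.7, ("NOUN", "PRON"): 0.2,
    ("NOUN", "VERB"): 0.5, ("PRON", "VERB"): 0.5,
}
emit = {
    ("PREP", "b"): 0.5, ("NOUN", "clm"): 0.1, ("NOUN", "bcl"): 0.05,
    ("NOUN", "cl"): 0.1, ("PRON", "m"): 0.3, ("VERB", "hlk"): 0.2,
}
sentence = [
    [   # three hypothetical segmentations of one ambiguous word form
        [("b", "PREP"), ("clm", "NOUN")],
        [("bcl", "NOUN"), ("m", "PRON")],
        [("b", "PREP"), ("cl", "NOUN"), ("m", "PRON")],
    ],
    [[("hlk", "VERB")]],  # an unambiguous word
]
best = viterbi_lattice(sentence, trans, emit)
print(best)
```

Because probabilities are attached to morphemes rather than to whole word forms, choosing the best path selects a segmentation and a tag sequence at the same time. One caveat of this simplified scoring is that analyses with more morphemes multiply more probabilities together and so tend to score lower; a real system would need to account for that.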