Morphological parsing and the lexicon
Lexical representation and process
Learning morpho-lexical probabilities from an untagged corpus with an application to Hebrew
Computational Linguistics
Constraint Grammar: A Language-Independent System for Parsing Unrestricted Text
Constraint Grammar: A Language-Independent System for Parsing Unrestricted Text
A stochastic parts program and noun phrase parser for unrestricted text
ANLC '88 Proceedings of the second conference on Applied natural language processing
Tagging and morphological disambiguation of Turkish text
ANLC '94 Proceedings of the fourth conference on Applied natural language processing
A practical part-of-speech tagger
ANLC '92 Proceedings of the third conference on Applied natural language processing
Morphological disambiguation by voting constraints
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Combining stochastic and rule-based methods for disambiguation in agglutinative languages
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Tagging inflective languages: prediction of morphological categories for a rich, structured tagset
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Dependency parsing with an extended finite state approach
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
A statistical information extraction system for Turkish
Natural Language Engineering
How do search engines respond to some non-English queries?
Journal of Information Science
Serial combination of rules and statistics: a case study in Czech tagging
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Minimally supervised morphological analysis by multimodal alignment
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
A unified language model for large vocabulary continuous speech recognition of Turkish
Signal Processing - Fractional calculus applications in signals and systems
Automatic learning of language model structure
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Context-based morphological disambiguation with random fields
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Part-of-speech tagging of modern hebrew text
Natural Language Engineering
Part-of-Speech Tagging Using Word Probability Based on Category Patterns
CICLing '07 Proceedings of the 8th International Conference on Computational Linguistics and Intelligent Text Processing
HunPos: an open source trigram tagger
ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Multilingual noise-robust supervised morphological analysis using the WordFrame model
SIGMorPhon '04 Proceedings of the 7th Meeting of the ACL Special Interest Group in Computational Phonology: Current Themes in Computational Phonology and Morphology
Web-based frequency dictionaries for medium density languages
WAC '06 Proceedings of the 2nd International Workshop on Web as Corpus
A discriminative model for joint morphological disambiguation and dependency parsing
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Morphological annotation of a corpus with a collaborative multiplayer game
CICLing'10 Proceedings of the 11th international conference on Computational Linguistics and Intelligent Text Processing
Verb analysis in a highly inflective language with an MFF algorithm
PROPOR'12 Proceedings of the 10th international conference on Computational Processing of the Portuguese Language
Hi-index | 0.00 |
In this paper, we present statistical models for morphological disambiguation in Turkish. Turkish presents an interesting problem for statistical models since the potential tag set size is very large because of the productive derivational morphology. We propose to handle this by breaking up the morhosyntactic tags into inflectional groups, each of which contains the inflectional features for each (intermediate) derived form. Our statistical models score the probability of each morhosyntactic tag by considering statistics over the individual inflection groups in a trigram model. Among the three models that we have developed and tested, the simplest model ignoring the local morphotactics within words performs the best. Our best trigram model performs with 93.95% accuracy on our test data getting all the morhosyntactic and semantic features correct. If we are just interested in syntactically relevant features and ignore a very small set of semantic features, then the accuracy increases to 95.07%.