Exploiting morphology in Turkish named entity recognition system

Authors:
Reyyan Yeniterzi
Affiliations:
Carnegie Mellon University, Pittsburgh, PA
Venue:
HLT-SS '11 Proceedings of the ACL 2011 Student Session
Year:
2011

Abstract

Turkish is an agglutinative language with complex morphological structures, so word forms alone are not enough for many computational tasks. In this paper, we analyze the effect of morphology in a Named Entity Recognition system for Turkish. We start with the standard word-level representation and incrementally explore the effect of capturing syntactic and contextual properties of tokens. We also explore a new representation in which roots and morphological features are represented as separate tokens, instead of representing only whole words as tokens. Using syntactic and contextual properties with the new representation provides a 7.6% relative improvement over the baseline.
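
The contrast between the word-level representation and the root-plus-features representation described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the example sentence, the hand-written morphological analyses, the feature tag names, and the "+" prefix on feature tokens are hypothetical and are not taken from the paper, which may segment and label tokens differently.

```python
from typing import List, Tuple

# Hypothetical analysis format: (surface form, root, morphological features).
# In practice these would come from a morphological analyzer and disambiguator;
# here they are written by hand purely for illustration.
Analysis = Tuple[str, str, List[str]]


def word_level(sentence: List[Analysis]) -> List[str]:
    """Standard representation: one token per surface word form."""
    return [surface for surface, _root, _feats in sentence]


def root_and_feature_level(sentence: List[Analysis]) -> List[str]:
    """Alternative representation: the root and each morphological
    feature become separate tokens in the sequence."""
    tokens: List[str] = []
    for _surface, root, feats in sentence:
        tokens.append(root)
        # The "+" prefix marking feature tokens is an illustrative convention.
        tokens.extend("+" + feat for feat in feats)
    return tokens


# "Ankara'da yaşıyor" ("he/she lives in Ankara"); tags are illustrative only.
sentence: List[Analysis] = [
    ("Ankara'da", "Ankara", ["Noun", "Prop", "Loc"]),
    ("yaşıyor", "yaşa", ["Verb", "Prog1", "A3sg"]),
]

print(word_level(sentence))
print(root_and_feature_level(sentence))
```

In the second sequence the root "Ankara" appears as the same token regardless of which case suffix the surface word carries, which is one plausible reason a sequence labeler could benefit from this representation in an agglutinative language.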