Expert Systems with Applications: An International Journal
Named entity recognition (NER) is one of the main information extraction tasks, yet research on NER for Turkish texts remains scarce. In this study, we present a rule-based NER system for Turkish that employs a set of lexical resources and pattern bases to extract named entities, namely the names of people, locations, and organizations, together with time/date and money/percentage expressions. The system targets news texts and does not rely on capitalization or punctuation, important clues that may be missing in texts obtained from the Web or produced by automatic speech recognition tools. The system is evaluated on news texts and on other genres, including child stories and historical texts. As expected of manually engineered rule-based systems, its performance degrades on these latter genres, since they differ from the target domain of news. The system is also evaluated on transcriptions of news videos, where it achieves satisfactory results, an important step towards employing NER for the automatic semantic annotation of videos in Turkish. The study is significant as the first rule-based approach to NER on Turkish texts, and for its evaluation on diverse text types.
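To make the general idea of combining lexical resources with pattern bases more concrete, the following is a minimal sketch in Python. It is not the system described above: the word lists, the regular expressions, the entity labels, and the function names (`normalize`, `extract_entities`) are all hypothetical and chosen only for illustration. Lowercasing the input stands in for the assumption that capitalization is unavailable, and tokenizing on word characters stands in for ignoring punctuation.

```python
import re

# Hypothetical toy lexical resources (the real system's lists are not given here).
PERSON_NAMES = {"ahmet", "ayşe", "mehmet"}
LOCATIONS = {"ankara", "istanbul", "izmir"}
ORG_HEADS = {"üniversitesi", "bakanlığı", "bankası"}

# Hypothetical pattern base for percentage and money expressions.
PERCENT_RE = re.compile(r"yüzde\s+\d+(?:[.,]\d+)?|%\s?\d+(?:[.,]\d+)?")
MONEY_RE = re.compile(
    r"\d+(?:[.,]\d+)?\s+(?:milyon\s+|milyar\s+)?(?:lira|dolar|euro)"
)

# Turkish distinguishes dotted and dotless i, so map them before lowercasing.
TURKISH_UPPER = str.maketrans({"İ": "i", "I": "ı"})


def normalize(text):
    """Lowercase the text so that no rule can depend on capitalization."""
    return text.translate(TURKISH_UPPER).lower()


def extract_entities(text):
    """Return (label, surface form) pairs found by the toy rules."""
    norm = normalize(text)
    entities = []

    # Pattern-based rules operate on the normalized string.
    for match in PERCENT_RE.finditer(norm):
        entities.append(("PERCENT", match.group()))
    for match in MONEY_RE.finditer(norm):
        entities.append(("MONEY", match.group()))

    # Lexicon-based rules operate on word tokens; punctuation is discarded.
    tokens = re.findall(r"\w+", norm)
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if i + 1 < len(tokens) and tokens[i + 1] in ORG_HEADS:
            # A word followed by an organization head word forms one entity.
            entities.append(("ORGANIZATION", token + " " + tokens[i + 1]))
            i += 2
            continue
        if token in PERSON_NAMES:
            entities.append(("PERSON", token))
        elif token in LOCATIONS:
            entities.append(("LOCATION", token))
        i += 1
    return entities
```

Checking the organization pattern before the plain lexicon lookup keeps a location name that is part of an organization name from being reported twice. A sketch of this kind also makes the domain-dependence noted above easy to see: entities outside the hand-built lists and patterns are simply missed.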