Arabic named entity recognition: using features extracted from noisy data

Authors:
Yassine Benajiba;Imed Zitouni;Mona Diab;Paolo Rosso
Affiliations:
Columbia University;IBM T.J. Watson Research Center, Yorktown Heights;Columbia University;Universidad Politécnica de Valencia
Venue:
ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Year:
2010

Abstract

Building an accurate Named Entity Recognition (NER) system for languages with complex morphology is a challenging task. In this paper, we explore the feature space using both gold and bootstrapped noisy features to build a more accurate Arabic NER system. We bootstrap the noisy features by projection from an Arabic-English parallel corpus that is automatically tagged with a baseline NER system. The feature space covers lexical, morphological, and syntactic features. The proposed approach yields an improvement of up to 1.64 points of F-measure (absolute).
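
The projection step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the English side of a sentence pair has already been tagged, that word alignments are available as (English index, Arabic index) pairs, and that each projected tag is then attached to the Arabic token as one extra noisy feature. The function names `project_tags` and `token_features`, the tag set, and the transliterated example tokens are all hypothetical.

```python
from typing import Dict, List, Tuple


def project_tags(
    src_tags: List[str],
    alignment: List[Tuple[int, int]],
    tgt_len: int,
    default: str = "O",
) -> List[str]:
    """Copy each source-side tag onto the target tokens aligned to it.

    Assumption: `alignment` holds (source_index, target_index) pairs.
    Unaligned target tokens keep the `default` tag.
    """
    projected = [default] * tgt_len
    for src_idx, tgt_idx in alignment:
        tag = src_tags[src_idx]
        # Keep the first entity tag that reaches a target token; never
        # overwrite it with a later tag or with the default "O".
        if tag != default and projected[tgt_idx] == default:
            projected[tgt_idx] = tag
    return projected


def token_features(tokens: List[str], projected: List[str]) -> List[Dict[str, str]]:
    """Pair each target token with its projected (noisy) tag as a feature."""
    return [{"word": w, "projected_tag": t} for w, t in zip(tokens, projected)]


# Illustrative sentence pair (tokens and alignment are made up).
english_tags = ["PER", "PER", "O", "LOC"]        # Barack Obama visited Cairo
arabic_tokens = ["zAr", "bArAk", "AwbAmA", "AlqAhrp"]
alignment = [(0, 1), (1, 2), (2, 0), (3, 3)]

noisy_tags = project_tags(english_tags, alignment, len(arabic_tokens))
features = token_features(arabic_tokens, noisy_tags)
```

Because the source tags and the alignments are both produced automatically, the projected tags are noisy. In a setup like this they would serve as one input feature alongside the lexical, morphological, and syntactic features, not as labels.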