RENAR: A Rule-Based Arabic Named Entity Recognition System

Author: Wajdi Zaghouani
Affiliation: University of Pennsylvania
Venue: ACM Transactions on Asian Language Information Processing (TALIP)
Year: 2012

Abstract

Named entity recognition supports many natural language processing tasks, such as information retrieval, machine translation, and question answering. Researchers have addressed name identification in a variety of languages, and some recent research efforts focus on named entity recognition for Arabic. We present a working Arabic information extraction (IE) system that analyzes large volumes of news text every day to extract the named entity (NE) types person, organization, location, date, and number, as well as quotations (direct reported speech) by and about people. The named entity recognition (NER) system was not developed specifically for Arabic; instead, an existing multilingual NER system was adapted to cover Arabic as well. Arabic, a Semitic language, differs substantially from the Indo-European and Finno-Ugric languages currently covered by the system. This article therefore describes which Arabic-specific resources had to be developed and which changes had to be made to the rule set to make it applicable to Arabic. The evaluation results are generally satisfactory but could be improved for certain entity types.