Arabic presents an interesting challenge to natural language processing because it is a highly inflected and agglutinative language. This paper presents an in-depth investigation of the entity detection and recognition (EDR) task for Arabic. We start by highlighting why segmentation is a necessary prerequisite for EDR, continue by presenting a finite-state statistical segmenter, and then examine how the resulting segments can be better incorporated into a mention detection system and an entity recognition system; both systems are statistical and built around the maximum entropy principle. Experiments on a clearly stated partition of the ACE 2004 data show that stem-based features can significantly improve the performance of the EDR system, by 2 absolute F-measure points. The system presented here performed competitively in the ACE 2004 evaluation.
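To illustrate why segmentation matters for stem-based features, the following is a minimal sketch and not the system described above. It replaces the finite-state statistical segmenter with a toy rule-based affix stripper. The affix lists and the `segment` and `stem_features` helpers are hypothetical names introduced only for this example.

```python
# Toy illustration only: a rule-based affix stripper standing in for a
# statistical segmenter. The affix inventories below are hypothetical and
# far from complete.
PREFIXES = ("و", "ب", "ل", "ال")   # e.g. conjunction "and", prepositions, article
SUFFIXES = ("هم", "ها", "ه")        # e.g. pronominal clitics


def segment(token, min_stem=2):
    """Strip at most one known prefix and one known suffix from a token.

    Returns a (prefix, stem, suffix) triple; an affix is only removed if at
    least `min_stem` characters remain.
    """
    prefix = suffix = ""
    for p in PREFIXES:
        if token.startswith(p) and len(token) - len(p) >= min_stem:
            prefix, token = p, token[len(p):]
            break
    for s in SUFFIXES:
        if token.endswith(s) and len(token) - len(s) >= min_stem:
            suffix, token = s, token[:-len(s)]
            break
    return prefix, token, suffix


def stem_features(tokens, i):
    """Build a feature dictionary for token i, of the kind a maximum entropy
    classifier could consume (the feature names are assumptions)."""
    prefix, stem, suffix = segment(tokens[i])
    prev_stem = segment(tokens[i - 1])[1] if i > 0 else "<s>"
    return {
        "word": tokens[i],
        "stem": stem,
        "prefix": prefix,
        "suffix": suffix,
        "prev_stem": prev_stem,
    }
```

With this sketch, the single written token "وكتابهم" ("and their book") splits into a conjunction prefix, the stem "كتاب", and a pronominal suffix, so a stem feature is shared with the bare form "كتاب" even though the surface words differ.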