Named Entity Recognition or Extraction (NER) is an important task in automated text processing for industry and academia working in language processing, intelligence gathering, and bioinformatics. In this paper we discuss the general problem of Named Entity Recognition and, more specifically, the challenges of NER in languages that lack language resources, e.g., large annotated corpora. We address the challenges specific to Urdu NER and distinguish Urdu from other South Asian (Indic) languages. We discuss the differences between Hindi and Urdu and conclude that the NER computational models for Hindi cannot be applied to Urdu. A rule-based Urdu NER algorithm is presented that outperforms the models that use statistical learning.