Automatic Detection of Arabic Non-Anaphoric Pronouns for Improving Anaphora Resolution

Authors: Muhammad Abdul-Mageed
Affiliations: Indiana University
Venue: ACM Transactions on Asian Language Information Processing (TALIP)
Year: 2011

Abstract

Anaphora resolution is one of the most difficult tasks in NLP. Identifying non-referential pronouns before attempting anaphora resolution would be valuable: the system would not have to attempt to resolve such pronouns and would therefore make fewer errors. In addition, non-referential pronouns have been found to occur in non-trivial numbers in many domains. The task of detecting non-referential pronouns could also be incorporated into a part-of-speech tagger or a parser, or treated as an initial step in semantic interpretation. In this article, I describe a machine learning method for identifying non-referential pronouns in an annotated subsegment of the Penn Arabic Treebank using three different feature settings. I achieve an accuracy of 97.22% with 52 different features extracted from a small window of -5/+5 tokens surrounding each potentially non-referential pronoun.
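
The -5/+5 token window mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not list the 52 features or name the learning algorithm, so everything below is an assumption made for illustration only: the function name `window_features`, the feature keys, and the `<PAD>` placeholder are hypothetical and are not taken from the article.

```python
from typing import Dict, List

PAD = "<PAD>"  # hypothetical placeholder for positions outside the sentence


def window_features(tokens: List[str], index: int, size: int = 5) -> Dict[str, str]:
    """Collect the surface tokens in a -size/+size window around tokens[index].

    Illustrative only: the article's actual 52 features are not described
    in the abstract, so this sketch uses plain token identity per position.
    """
    if not 0 <= index < len(tokens):
        raise IndexError("index must point at a token in the sentence")
    features: Dict[str, str] = {}
    for offset in range(-size, size + 1):
        if offset == 0:
            continue  # the candidate pronoun itself is recorded separately below
        position = index + offset
        inside = 0 <= position < len(tokens)
        # Keys look like "tok[-2]" or "tok[+1]"; out-of-range positions are padded.
        features[f"tok[{offset:+d}]"] = tokens[position] if inside else PAD
    features["pronoun"] = tokens[index]
    return features


if __name__ == "__main__":
    # Placeholder tokens; a real input would be a tokenized treebank sentence.
    sentence = ["w1", "w2", "PRON", "w4"]
    for name, value in window_features(sentence, 2).items():
        print(name, value)
```

Each candidate pronoun yields one such feature dictionary, which could then be vectorized and passed to a classifier of one's choice; which classifier and which feature settings the article actually uses is not stated in the abstract.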