NADA: a robust system for non-referential pronoun detection

Authors:
Shane Bergsma;David Yarowsky
Affiliations:
Dept. of Computer Science and Human Language Technology Center of Excellence, Johns Hopkins University, US;Dept. of Computer Science and Human Language Technology Center of Excellence, Johns Hopkins University, US
Venue:
DAARC'11 Proceedings of the 8th international conference on Anaphora Processing and Applications
Year:
2011

Abstract

We present NADA: the Non-Anaphoric Detection Algorithm. NADA is a novel, publicly available program that accurately distinguishes between referential and non-referential uses of the pronoun "it" in raw English text. Like recent state-of-the-art approaches, NADA uses very large-scale web N-gram features, but NADA makes these features practical by compressing the N-gram counts so they can fit into computer memory. NADA therefore operates as a fast, stand-alone system. NADA also improves over previous web-scale systems by considering the entire sentence, rather than narrow context windows, via long-distance lexical features. NADA very substantially outperforms other state-of-the-art systems in non-referential detection accuracy.
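
As a rough illustration of the two ideas named in the abstract (N-gram context features around the pronoun, and compressed counts), the following Python sketch may help. It is not NADA's implementation: the function names `context_ngrams` and `quantize`, the window size of 5, and the log-scale bucketing are all assumptions made for this example.

```python
import math


def context_ngrams(tokens, idx, max_n=5):
    """Return every n-gram (2 <= n <= max_n) that spans position idx.

    Hypothetical helper: the window size and pattern shape are assumptions,
    not taken from the paper.
    """
    grams = []
    for n in range(2, max_n + 1):
        first = max(0, idx - n + 1)          # leftmost window containing idx
        last = min(idx, len(tokens) - n)     # rightmost window that still fits
        for start in range(first, last + 1):
            grams.append(tuple(tokens[start:start + n]))
    return grams


def quantize(count):
    """Lossy count compression (assumed scheme): keep only floor(log2(count + 1)),
    capped at 255 so each count fits in one byte."""
    return min(255, int(math.log2(count + 1)))


if __name__ == "__main__":
    sentence = "you can make it in advance".split()
    for gram in context_ngrams(sentence, sentence.index("it"), max_n=3):
        print(" ".join(gram))
```

In a system of this kind, each extracted N-gram would be looked up in a table of web-scale counts, and storing a small bucket per count instead of the exact value is one plausible way to keep that table in memory.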