Combining sample selection and error-driven pruning for machine learning of coreference rules

Authors:
Vincent Ng; Claire Cardie
Affiliations:
Cornell University, Ithaca, NY; Cornell University, Ithaca, NY
Venue:
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Year:
2002

Abstract

Most machine learning solutions to noun phrase coreference resolution recast the problem as a classification task. We examine three potential problems with this reformulation, namely, skewed class distributions, the inclusion of "hard" training instances, and the loss of transitivity inherent in the original coreference relation. We show how these problems can be handled via intelligent sample selection and error-driven pruning of classification rule-sets. The resulting system achieves an F-measure of 69.5 and 63.4 on the MUC-6 and MUC-7 coreference resolution data sets, respectively, surpassing the performance of the best MUC-6 and MUC-7 coreference systems. In particular, the system outperforms the best-performing learning-based coreference system to date.
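The abstract does not spell out the selection or pruning procedures, so the following is only a minimal sketch of the general setting, not the authors' method. It assumes a document is given as a list of noun phrases tagged with hypothetical gold chain ids (`None` for a non-coreferent NP). It shows how labeling every NP pair produces a skewed class distribution, how one commonly used heuristic (a positive instance with the closest preceding antecedent, negatives only for the NPs in between) reduces that skew, and how a transitive closure over positive pairwise decisions recovers equivalence classes. All function names are illustrative.

```python
from itertools import combinations


def all_pairs(chain_ids):
    """Label every NP pair (i < j): 1 if both NPs share a chain id, else 0."""
    return [
        (i, j, int(chain_ids[i] is not None and chain_ids[i] == chain_ids[j]))
        for i, j in combinations(range(len(chain_ids)), 2)
    ]


def select_pairs(chain_ids):
    """Assumed heuristic: for each anaphoric NP j, emit one positive instance
    with its closest preceding antecedent i and negative instances only for
    the NPs strictly between i and j."""
    instances = []
    for j, cid in enumerate(chain_ids):
        if cid is None:
            continue
        antecedents = [i for i in range(j) if chain_ids[i] == cid]
        if not antecedents:
            continue  # first mention of its chain: nothing to resolve
        i = antecedents[-1]
        instances.append((i, j, 1))
        instances.extend((k, j, 0) for k in range(i + 1, j))
    return instances


def cluster(n, positive_links):
    """Restore transitivity: union-find closure over pairwise positive links."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in positive_links:
        parent[find(a)] = find(b)

    groups = {}
    for x in range(n):
        groups.setdefault(find(x), []).append(x)
    return sorted(groups.values())


# Toy document: seven NPs, two chains ("A" and "B"), two singletons.
chain_ids = ["A", None, "B", "A", None, "B", "A"]

full = all_pairs(chain_ids)
selected = select_pairs(chain_ids)
links = [(i, j) for i, j, label in selected if label == 1]
clusters = cluster(len(chain_ids), links)

print(sum(l for _, _, l in full), "positives out of", len(full), "pairs")
print(sum(l for _, _, l in selected), "positives out of", len(selected), "selected")
print(clusters)
```

On the toy document, exhaustive pairing yields 4 positives among 21 instances, while the selected set has 3 positives among 9. The closure step then merges the links (0, 3) and (3, 6) into a single chain even though the pair (0, 6) was never classified directly.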