Coreference annotation schema for an inflectional language

Authors:
Maciej Ogrodniczuk;Magdalena Zawisławska;Katarzyna Głowińska;Agata Savary
Affiliations:
Institute of Computer Science, Polish Academy of Sciences, Poland;Institute of Polish Language, Warsaw University, Poland;Lingventa, Poland;Laboratoire d'informatique, François Rabelais University Tours, France
Venue:
CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
Year:
2013

Abstract

Creating a coreference corpus for an inflectional, free-word-order language is challenging because of syntactic features that existing annotation guidelines largely ignore: the absence of definite and indefinite articles (which makes quasi-anaphoricity very common), frequent zero subjects, and discrepancies between syntactic and semantic heads. This paper reports on the experience gained in preparing such a resource for an ongoing project (CORE) that aims to create tools for coreference resolution. Starting with a clarification of the relation between noun groups and mentions, continuing with the definition of the annotation scope and strategies, and ending with the actual decisions made for borderline cases, we present the process of building what is, to the best of our knowledge, the first corpus of general coreference for Polish.