ANARESOLUTION '97 Proceedings of a Workshop on Operational Factors in Practical, Robust Anaphora Resolution for Unrestricted Texts
This paper introduces a methodology for analysing and resolving cases of coreference in dialogues in English and Portuguese. A four-attribute annotation scheme was applied to a sample of around three thousand cases of anaphora in each language, collected from dialogue corpora. The information thus gathered was analysed by means of exploratory and model-building statistical procedures. A probabilistic model was then built on the basis of aggregate combinations of categories across the four attributes. This model, in combination with direct observation of cases, was used to build an antecedent-likelihood theory, which is at present being organised as a decision tree so that it can be tested with a view to automatic annotation and subsequent resolution of coreference cases in dialogues in both languages. It is thought that the findings could be extended to Spanish, Italian and possibly French.
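As a rough illustration of what a probabilistic model over combinations of categories might look like, the Python sketch below estimates, for each combination of four attribute values, the relative frequency of each outcome and then picks the most likely one. It is only a sketch under stated assumptions: the abstract does not name the four attributes, their categories, or the estimation procedure, so the function names (`build_model`, `most_likely`), the attribute values, and the outcome labels are all hypothetical placeholders, and the toy cases are invented for the example rather than taken from the corpora.

```python
from collections import Counter, defaultdict


def build_model(cases):
    """Estimate P(outcome | attribute combination) by relative frequency.

    `cases` is an iterable of (attrs, outcome) pairs, where `attrs` is a
    sequence of four category labels. The labels are hypothetical; the
    abstract does not list the real attributes or categories.
    """
    counts = defaultdict(Counter)
    for attrs, outcome in cases:
        counts[tuple(attrs)][outcome] += 1

    model = {}
    for combo, outcome_counts in counts.items():
        total = sum(outcome_counts.values())
        model[combo] = {o: n / total for o, n in outcome_counts.items()}
    return model


def most_likely(model, attrs):
    """Return the highest-probability outcome for a combination, or None if unseen."""
    dist = model.get(tuple(attrs))
    if dist is None:
        return None
    return max(dist, key=dist.get)


# Invented placeholder cases, not data from the paper.
toy_cases = [
    (("a", "x", "p", "m"), "explicit"),
    (("a", "x", "p", "m"), "explicit"),
    (("a", "x", "p", "m"), "implicit"),
    (("b", "y", "q", "n"), "implicit"),
]

model = build_model(toy_cases)
prediction = most_likely(model, ("a", "x", "p", "m"))
```

A lookup table like this returns nothing for combinations absent from the sample. A decision tree, as mentioned in the abstract, is one way to generalise over such unseen combinations by testing the attributes one at a time.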