Supporting anaphor resolution in dialogues with a corpus-based probabilistic model

Authors:
Marco Rocha
Affiliations:
University of Sussex, Brighton, U.K.
Venue:
ANARESOLUTION '97 Proceedings of a Workshop on Operational Factors in Practical, Robust Anaphora Resolution for Unrestricted Texts
Year:
1997

Abstract

This paper describes a corpus-based investigation of anaphora in dialogues, using data from English and Portuguese face-to-face conversations. The approach relies on the manual annotation of around three thousand cases of anaphora in each language, in order to create a database of real-life usage that ultimately aims to support anaphora interpreters in NLP systems. Each case of anaphora was annotated according to four properties described in the paper. The code used for the annotation is also described. Once the required number of cases had been analysed, a probabilistic model was built by linking the categories in each property to form a probability tree. The results are summed up in an antecedent-likelihood theory, which draws on these probabilities and on observed regularities of the immediate context to support anaphor resolution by selecting the most likely antecedent. The theory will be tested on a previously annotated dialogue and then fine-tuned for best performance. Automatic annotation is briefly discussed. Possible applications include machine translation, computer-aided language learning, and dialogue systems in general.
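The probability tree mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each annotated case is stored as a tuple of four category labels, one per property, and that branch probabilities are estimated as relative frequencies. The abstract does not name the properties or their categories, so the labels in TOY_CASES and the function names build_tree, branch_probability, and most_likely are invented for illustration and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical annotated cases: one tuple per anaphor, one label per property.
# The four labels are invented placeholders, not the paper's categories.
TOY_CASES = [
    ("pronoun", "nominal", "explicit", "first_candidate"),
    ("pronoun", "nominal", "explicit", "first_candidate"),
    ("pronoun", "nominal", "explicit", "discourse_topic"),
    ("pronoun", "verbal", "implicit", "discourse_topic"),
    ("demonstrative", "verbal", "explicit", "first_candidate"),
]


def build_tree(cases):
    """Count how many cases pass through each node (prefix of labels)."""
    counts = defaultdict(int)
    for case in cases:
        for depth in range(len(case) + 1):
            counts[tuple(case[:depth])] += 1
    return counts


def branch_probability(counts, path):
    """Relative frequency of the last label given the labels before it."""
    path = tuple(path)
    parent = counts.get(path[:-1], 0)
    return counts.get(path, 0) / parent if parent else 0.0


def most_likely(counts, prefix):
    """Most frequent next label under a partial path, or None if unseen."""
    prefix = tuple(prefix)
    children = {
        path[-1]: n
        for path, n in counts.items()
        if len(path) == len(prefix) + 1 and path[:-1] == prefix
    }
    return max(children, key=children.get) if children else None


counts = build_tree(TOY_CASES)
print(branch_probability(counts, ("pronoun",)))
print(branch_probability(counts, ("pronoun", "nominal")))
print(most_likely(counts, ("pronoun", "nominal", "explicit")))
```

In this sketch the last position of each tuple stands in for the property that an interpreter would want to predict, so most_likely returns the most frequent value of that property given the first three. Whether the paper orders its properties this way is an assumption.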