Automatic anaphora resolution for Norwegian (ARN)

  • Authors:
  • Gordana Ilić Holen

  • Affiliations:
  • University of Oslo, Department of Literature, Area Studies and European Languages, Blindern, Oslo

  • Venue:
  • DAARC'07: Proceedings of the 6th Discourse Anaphora and Anaphor Resolution Conference on Anaphora: Analysis, Algorithms and Applications
  • Year:
  • 2007

Abstract

The ARN system (an Automatic Anaphora Resolution System for Norwegian) is a rule-based anaphora resolution system designed on the basis of two existing systems for English: Mitkov's Original Approach with its later development MARS, and the RAP system by Lappin and Leass. A substantial group of rules within these systems is based upon a super-rule, supported by Centering theory, that gives preference to subject candidates over object candidates, and to object candidates over candidates within adverbial and prepositional phrases. These rules cannot be applied to Norwegian because of differences in information structure between Norwegian and English. Although both languages tend to avoid conveying new information with the subject, Norwegian goes to much greater lengths to avoid it. This tendency leads to a substantially higher number of sentences with expletive subjects in Norwegian than in English, and such subjects are unsuitable as antecedent candidates. Introducing a complex preference to handle the sex/gender conflict, and giving preference to pronominal candidates and to candidates in close proximity to the anaphor, has proved to be a good strategy for Norwegian. ARN was designed to resolve third-person pronouns, with the exception of the pronoun det 'it (neut.)', and has achieved an accuracy of 70.5%.
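
The preference-based selection described in the abstract can be illustrated with a minimal sketch. It is not the ARN implementation: the `Candidate` fields, the weights, and the `score` and `resolve` functions are all hypothetical, chosen only to show how a bonus for pronominal candidates and a bonus for proximity to the anaphor might be combined. The sex/gender preference is omitted because the abstract does not describe how it works.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    text: str
    is_pronoun: bool   # True if the candidate is itself a pronoun
    distance: int      # sentences between candidate and anaphor (0 = same sentence)


# Hypothetical weights, not taken from the paper.
PRONOUN_BONUS = 2
MAX_PROXIMITY_BONUS = 2


def score(c: Candidate) -> int:
    """Sum the illustrative preferences for one candidate."""
    s = 0
    if c.is_pronoun:
        s += PRONOUN_BONUS
    # Closer candidates get a larger bonus; it never drops below zero.
    s += max(0, MAX_PROXIMITY_BONUS - c.distance)
    return s


def resolve(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the highest-scoring candidate, or None if there are none."""
    if not candidates:
        return None
    return max(candidates, key=score)
```

In this sketch a pronominal candidate one sentence back outscores a full noun phrase at the same distance, which mirrors the stated preference for pronominal and nearby candidates without claiming anything about the actual rule weights in ARN.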