Learning word-level dialectal variation as phonological replacement rules using a limited parallel corpus

  • Authors:
  • Mans Hulden; Iñaki Alegria; Izaskun Etxeberria; Montse Maritxalar

  • Affiliations:
  • University of Helsinki; IXA taldea, UPV-EHU; IXA taldea, UPV-EHU; IXA taldea, UPV-EHU

  • Venue:
  • DIALECTS '11 Proceedings of the First Workshop on Algorithms and Resources for Modelling of Dialects and Language Varieties
  • Year:
  • 2011

Abstract

This paper explores two methods of learning dialectal morphology from a small parallel corpus of standard and dialect-form text, given that a computational description of the standard morphology is available. The goal is to produce a model that translates individual dialectal lexical items to their standard dialect counterparts, so that available NLP tools, which assume standard-form input, can be applied to dialectal text. The results show that a learning method based on inductive logic programming quickly converges to the correct model for many phonological and morphological differences that are regular in nature.