Multiple sequence alignments in linguistics

  • Authors:
  • Jelena Prokić; Martijn Wieling; John Nerbonne

  • Affiliations:
  • University of Groningen, The Netherlands; University of Groningen, The Netherlands; University of Groningen, The Netherlands

  • Venue:
  • LaTeCH-SHELT&R '09 Proceedings of the EACL 2009 Workshop on Language Technology and Resources for Cultural Heritage, Social Sciences, Humanities, and Education
  • Year:
  • 2009

Abstract

In this study we apply and evaluate an iterative pairwise alignment program for producing multiple sequence alignments, ALPHAMALIG (Alonso et al., 2004), using as material the phonetic transcriptions of words used in Bulgarian dialectological research. To evaluate the quality of the multiple alignment, we propose two new methods based on comparing each column in the obtained alignments with the corresponding column in a set of gold standard alignments. Our results show that the alignments produced by ALPHAMALIG correspond well with the gold standard alignments, making this algorithm suitable for the automatic generation of multiple string alignments. Multiple string alignment is particularly interesting for historical reconstruction based on sound correspondences.
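
The abstract does not spell out the two column-based evaluation methods, so the following is only a minimal sketch of the general idea, not the authors' measures. It assumes each alignment is a list of equal-length rows of segments with "-" as the gap symbol, and it scores a predicted alignment by the share of column positions whose column is identical to the gold standard column at the same position. The function names and the toy data are hypothetical.

```python
def columns(alignment):
    """Return the columns of an alignment given as equal-length rows."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows in an alignment must have the same length")
    return list(zip(*alignment))


def column_match_score(predicted, gold):
    """Fraction of column positions where predicted and gold columns agree.

    Assumption: columns are compared position by position, and any length
    difference between the two alignments counts as mismatched columns.
    """
    pred_cols = columns(predicted)
    gold_cols = columns(gold)
    total = max(len(pred_cols), len(gold_cols))
    if total == 0:
        return 1.0  # two empty alignments are trivially identical
    matches = sum(p == g for p, g in zip(pred_cols, gold_cols))
    return matches / total


# Toy alignments with placeholder symbols (not data from the study).
gold = [
    ["a", "b", "-", "c"],
    ["a", "-", "d", "c"],
    ["a", "b", "d", "c"],
]
predicted = [
    ["a", "b", "-", "c"],
    ["a", "d", "-", "c"],
    ["a", "b", "d", "c"],
]

score = column_match_score(predicted, gold)
print(score)
```

In this toy case the first and last columns agree and the two middle columns do not, so the score is 0.5. A strict identity test like this penalises a whole column for a single misplaced segment; a softer measure would give partial credit per column.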