Finding cognate groups using phylogenies

  • Authors:
  • David Hall;Dan Klein

  • Affiliations:
  • University of California, Berkeley;University of California, Berkeley

  • Venue:
  • ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

A central problem in historical linguistics is the identification of historically related cognate words. We present a generative phylogenetic model for automatically inducing cognate group structure from unaligned word lists. Our model represents the process of transformation and transmission from ancestor word to daughter word, as well as the alignment between the words lists of the observed languages. We also present a novel method for simplifying complex weighted automata created during inference to counteract the otherwise exponential growth of message sizes. On the task of identifying cognates in a dataset of Romance words, our model significantly outperforms a baseline approach, increasing accuracy by as much as 80%. Finally, we demonstrate that our automatically induced groups can be used to successfully reconstruct ancestral words.