Bayesian identification of cognates and correspondences

Authors: T. Mark Ellison
Affiliations: University of Western Australia, and Analith Ltd
Venue: SigMorPhon '07 Proceedings of Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology
Year: 2007

Abstract

This paper presents a Bayesian approach to comparing languages: identifying cognates and the regular correspondences that compose them. A simple model of language is extended to include these notions in an account of parent languages. An expression is developed for the posterior probability of child language forms given a parent language. Bayes' Theorem offers a schema for evaluating choices of cognates and correspondences to explain semantically matched data. An implementation optimising this value with gradient descent is shown to distinguish cognates from non-cognates in data from Polish and Russian.
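
As a rough illustration of the Bayesian schema mentioned above (not the paper's own notation), let H stand for a hypothesis, that is, a particular choice of cognate pairs and correspondences, and let D stand for the semantically matched word lists. Both symbols are assumptions introduced here for the sketch. Bayes' Theorem then reads:

```latex
% H: hypothesised cognates and correspondences (symbol assumed for illustration)
% D: semantically matched data from the two languages (symbol assumed for illustration)
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}
\quad\Longrightarrow\quad
\hat{H} = \arg\max_{H} \; P(D \mid H)\, P(H)
```

Because P(D) does not depend on H, competing choices of cognates and correspondences can be compared using only the product of likelihood and prior.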