An algorithm for identifying cognates between related languages

  • Author: Jacques B. M. Guy
  • Affiliation: Australian National University, Canberra, Australia
  • Venue: ACL '84 Proceedings of the 10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics
  • Year: 1984

Abstract

The algorithm takes as only input a list of words, preferably but not necessarily in phonemic transcription, in any two putatively related languages, and sorts it into decreasing order of probable cognation. The processing of a 250-item bilingual list takes about five seconds of CPU time on a DEC KL1091, and requires 56 pages of core memory. The algorithm is given no information whatsoever about the phonemic transcription used, and even though cognate identification is carried out on the basis of a context-free one-for-one matching of individual characters, its cognation decisions are bettered by a trained linguist using more information only in cases of wordlists sharing less than 40% cognates and involving complex, multiple sound correspondences.
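The abstract does not spell out how the matching is scored, so the following is only a minimal sketch of the general idea (context-free, one-for-one matching of characters, then sorting by probable cognation), not a reconstruction of the paper's algorithm. The function name `rank_by_cognation`, the position-by-position alignment, and the discounted Dice coefficient used as a correspondence strength are all assumptions made for illustration.

```python
from collections import Counter


def rank_by_cognation(pairs):
    """Sort (word_a, word_b) pairs by a crude cognation score, highest first.

    Illustrative assumption: characters are aligned position by position,
    with no knowledge of what any character means phonetically.
    """
    joint = Counter()  # times character x (language A) faces character y (language B)
    left = Counter()   # times x appears in an aligned position in language A
    right = Counter()  # times y appears in an aligned position in language B
    for a, b in pairs:
        for x, y in zip(a, b):  # zip drops the unmatched tail of the longer word
            joint[x, y] += 1
            left[x] += 1
            right[y] += 1

    def strength(x, y):
        # Dice coefficient with one occurrence discounted, so that a
        # correspondence observed only once in the whole list scores zero.
        denom = left[x] + right[y] - 2
        return 2 * (joint[x, y] - 1) / denom if denom > 0 else 0.0

    def score(a, b):
        longest = max(len(a), len(b))
        if longest == 0:
            return 0.0
        # Dividing by the longer length penalizes unmatched characters.
        return sum(strength(x, y) for x, y in zip(a, b)) / longest

    return sorted(pairs, key=lambda p: score(*p), reverse=True)
```

On a made-up list such as `[("pata", "fada"), ("piko", "figo"), ("tapi", "dafi"), ("kota", "goda"), ("mano", "xyzw")]`, the first four pairs share recurring correspondences (p:f, t:d, k:g) and the last pair shares none, so the last pair is ranked at the bottom. A realistic treatment would also have to handle insertions, deletions, and multiple sound correspondences, which this sketch ignores.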