Identifying complex sound correspondences in bilingual wordlists

Authors: Grzegorz Kondrak
Affiliations: Department of Computing Science, University of Alberta, Edmonton, AB, Canada
Venue: CICLing'03 Proceedings of the 4th international conference on Computational linguistics and intelligent text processing
Year: 2003

Abstract

The determination of recurrent sound correspondences between languages is crucial for the identification of cognates, which are often employed in statistical machine translation for sentence and word alignment. In this paper, an algorithm designed for extracting noncompositional compounds from bitexts is shown to be capable of determining complex sound correspondences in bilingual wordlists. In an experimental evaluation, a C++ implementation of the algorithm achieves approximately 90% recall and precision on authentic language data.
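
To make the notion of a recurrent sound correspondence concrete, the following minimal Python sketch ranks simple one-to-one correspondences in a tiny bilingual wordlist by co-occurrence. It is not the algorithm described in the paper, which targets complex correspondences, and it is unrelated to the C++ implementation mentioned above. The function name `rank_correspondences`, the `min_count` threshold, the use of the Dice coefficient as the association score, and the treatment of each orthographic letter as one segment are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def rank_correspondences(word_pairs, min_count=2):
    """Rank one-to-one segment correspondences by Dice coefficient.

    Illustrative baseline only (hypothetical helper): each character is
    treated as one segment, and a segment pair co-occurs once per word
    pair in which both segments appear.
    """
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in word_pairs:
        src_segs, tgt_segs = set(src), set(tgt)
        src_count.update(src_segs)          # word pairs containing s
        tgt_count.update(tgt_segs)          # word pairs containing t
        pair_count.update(product(src_segs, tgt_segs))

    scored = [
        ((s, t), 2 * n / (src_count[s] + tgt_count[t]))
        for (s, t), n in pair_count.items()
        if n >= min_count                   # drop pairs seen too rarely
    ]
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Toy English/German wordlist in ordinary spelling (not phonetic transcription).
WORD_PAIRS = [
    ("ten", "zehn"),
    ("two", "zwei"),
    ("tongue", "zunge"),
    ("tooth", "zahn"),
]

ranked = rank_correspondences(WORD_PAIRS)
for (src_seg, tgt_seg), score in ranked[:3]:
    print(f"{src_seg} : {tgt_seg}  {score:.3f}")
```

On this toy list the pair `t : z` ranks first, because the two letters appear together in every word pair and never apart. A pure co-occurrence score of this kind also rewards accidental pairs such as `t : e`, which is one reason alignment-based or model-based methods are preferred for real data.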