The determination of recurrent sound correspondences between languages is crucial for the identification of cognates, which are often employed in statistical machine translation for sentence and word alignment. In this paper, an algorithm originally designed for extracting noncompositional compounds from bitexts is shown to be capable of determining complex sound correspondences in bilingual wordlists. In an experimental evaluation, a C++ implementation of the algorithm achieves approximately 90% recall and precision on authentic language data.
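To make the evaluation measures concrete, the following is a minimal sketch of how precision and recall could be computed for a set of proposed sound correspondences against a reference set. It is not the paper's C++ implementation; the function name `precision_recall`, the representation of a correspondence as a pair of strings, and the toy data are all assumptions made for illustration.

```python
def precision_recall(proposed, gold):
    """Return (precision, recall) for proposed correspondences vs. a gold set.

    Each correspondence is a hypothetical (source_sound, target_sound) pair.
    """
    proposed, gold = set(proposed), set(gold)
    correct = proposed & gold  # correspondences found in both sets
    # Guard against empty sets to avoid division by zero.
    precision = len(correct) / len(proposed) if proposed else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall


# Toy data, invented for illustration only (not taken from the paper).
gold = {("p", "f"), ("t", "th"), ("k", "h"), ("d", "t")}
proposed = {("p", "f"), ("t", "th"), ("k", "h"), ("s", "s")}

precision, recall = precision_recall(proposed, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the four proposed pairs are in the reference set and three of the four reference pairs are recovered, so both measures come out at 0.75. How the paper itself counts a correspondence as correct is not stated in the abstract.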