In this paper, we propose a method for the automatic decipherment of lost languages. Given a non-parallel corpus in a known related language, our model produces both alphabetic mappings and translations of words into their corresponding cognates. We employ a non-parametric Bayesian framework to simultaneously capture both low-level character mappings and high-level morphemic correspondences. This formulation enables us to encode some of the linguistic intuitions that have guided human decipherers. When applied to the ancient Semitic language Ugaritic, the model correctly maps 29 of 30 letters to their Hebrew counterparts and deduces the correct Hebrew cognate for 60% of the Ugaritic words that have cognates in Hebrew.
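To make the idea of an alphabetic mapping concrete, the following is a minimal sketch of how a candidate letter mapping could be scored against a known-language vocabulary. It is not the non-parametric Bayesian model described above: it assumes a deterministic one-to-one letter mapping, ignores morphemes entirely, and counts only exact matches. The function names (`apply_mapping`, `cognate_coverage`) and the toy symbols are hypothetical and chosen purely for illustration; they are not real Ugaritic or Hebrew data.

```python
from typing import Dict, Iterable, Set


def apply_mapping(word: str, mapping: Dict[str, str]) -> str:
    """Rewrite a lost-language word letter by letter.

    Letters missing from the mapping become '?', so the result
    cannot accidentally match a known-language word.
    """
    return "".join(mapping.get(ch, "?") for ch in word)


def cognate_coverage(
    lost_words: Iterable[str],
    known_vocab: Set[str],
    mapping: Dict[str, str],
) -> float:
    """Fraction of lost-language words whose mapped form is in the known vocabulary."""
    words = list(lost_words)
    if not words:
        return 0.0
    hits = sum(1 for w in words if apply_mapping(w, mapping) in known_vocab)
    return hits / len(words)


# Toy, made-up data (assumption: placeholder symbols, not real languages).
lost_words = ["abc", "cab", "bca"]
known_vocab = {"xyz", "zxy", "qqq"}

good_mapping = {"a": "x", "b": "y", "c": "z"}
bad_mapping = {"a": "y", "b": "x", "c": "z"}

good_score = cognate_coverage(lost_words, known_vocab, good_mapping)
bad_score = cognate_coverage(lost_words, known_vocab, bad_mapping)
print(f"good mapping coverage: {good_score:.2f}")
print(f"bad mapping coverage:  {bad_score:.2f}")
```

A score like this only ranks candidate mappings by how many words they turn into attested forms. The model in the abstract goes further by treating character mappings and morpheme correspondences as jointly inferred, probabilistic quantities.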