We describe the implementation of a computer program, the Reconstruction Engine (RE), which models the comparative method for establishing genetic affiliation among a group of languages. The program is a research tool designed to aid the linguist in evaluating specific hypotheses, by calculating the consequences of a set of postulated sound changes (proposed by the linguist) on complete lexicons of several languages. It divides the lexicons into a phonologically regular part and a part that deviates from the sound laws. RE is bi-directional: given words in modern languages, it can propose cognate sets (with reconstructions); given reconstructions, it can project the modern forms that would result from regular changes. RE operates either interactively, allowing word-by-word evaluation of hypothesized sound changes and semantic shifts, or in a "batch" mode, processing entire multilingual lexicons en masse.

We describe the algorithms implemented in RE, specifically the parsing and combinatorial techniques used to make projections upstream or downstream in the sense of time, the procedures for creating and consolidating cognate sets based on these projections, and the ad hoc techniques developed for handling the semantic component of the comparative method.

Other programs and computational approaches to historical linguistics are briefly reviewed.

Some results from a study of the Tamang languages of Nepal (a subgroup of the Tibeto-Burman family) are presented, and data from these languages are used throughout for exemplification of the operation of the program.

Finally, we discuss features of RE that make it possible to handle the complex and sometimes imprecise representations of lexical items, and speculate on possible directions for future research.
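The downstream projection and the regular/irregular split described above can be illustrated with a minimal Python sketch. It is not RE's implementation. The correspondence table, the language name `lang_b`, the function names, and the word forms are all hypothetical, and each character is assumed to be one segment, which real lexical data would not satisfy.

```python
from itertools import product

# Hypothetical table: each proto-segment maps to the modern reflexes
# that a linguist has postulated for one daughter language.
CORRESPONDENCES = {
    "lang_b": {"p": ["b"], "a": ["a"], "t": ["d"], "k": ["g", "k"], "u": ["o"]},
}


def project_downstream(proto_form, table):
    """Return every modern form predicted from proto_form by regular change.

    Assumes one character per segment. Returns an empty set when a
    segment has no postulated reflex in the table.
    """
    options = []
    for segment in proto_form:
        if segment not in table:
            return set()
        options.append(table[segment])
    # Combine the alternative reflexes of each segment in every possible way.
    return {"".join(combo) for combo in product(*options)}


def partition_lexicon(lexicon, proto_forms, table):
    """Split a modern lexicon into forms the table predicts and forms it does not."""
    predicted = set()
    for proto_form in proto_forms:
        predicted |= project_downstream(proto_form, table)
    regular = sorted(word for word in lexicon if word in predicted)
    irregular = sorted(word for word in lexicon if word not in predicted)
    return regular, irregular


if __name__ == "__main__":
    table = CORRESPONDENCES["lang_b"]
    print(project_downstream("pak", table))
    print(partition_lexicon(["bag", "dom", "bak"], ["pak", "tu"], table))
```

Here a proto-segment with several reflexes simply multiplies the number of predicted forms. The sketch ignores conditioning environments, multi-character segments, and the semantic component, all of which the abstract indicates RE has to handle.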