We investigate the problem of measuring phonetic similarity, focusing on the identification of cognates, that is, words of the same origin in different languages. We compare representatives of two principal approaches to computing phonetic similarity: manually designed metrics and learning algorithms. In particular, we consider a stochastic transducer, a Pair HMM, several dynamic Bayesian network (DBN) models, and two constructed schemes. We test these approaches on the task of identifying cognates among Indo-European languages, in both supervised and unsupervised settings. Our results suggest that the averaged context DBN model and the Pair HMM achieve the highest accuracy given a large training set of positive examples.