We present a system for computing similarity between pairs of words. Our system is based on Pair Hidden Markov Models, a variation on Hidden Markov Models that has been used successfully for the alignment of biological sequences. The parameters of the model are automatically learned from training data that consists of word pairs known to be similar. Our tests focus on the identification of cognates, that is, words of common origin in related languages. The results show that our system outperforms previously proposed techniques.
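
To make the idea of scoring a word pair with a Pair Hidden Markov Model more concrete, here is a minimal sketch of the forward algorithm. It is an illustration, not the system described above. It assumes the standard three-state Pair HMM from the biological sequence alignment literature: state M emits a pair of symbols, state X emits a symbol from the first word only, and state Y emits a symbol from the second word only. The names `pair_emission`, `gap_emission`, and `forward_log_prob`, the lowercase alphabet, and all numeric parameters (`match_mass`, `delta`, `epsilon`, `tau`) are hypothetical and hand-set here; in the system described above the parameters are learned from training data.

```python
import math
from string import ascii_lowercase

ALPHABET = ascii_lowercase  # assumption: plain lowercase letters as symbols
K = len(ALPHABET)


def pair_emission(a, b, match_mass=0.9):
    """Toy probability that state M emits the pair (a, b).

    Hypothetical hand-set values: identical symbols share `match_mass`,
    all other pairs share the remainder. A trained model would learn these.
    """
    if a == b:
        return match_mass / K
    return (1.0 - match_mass) / (K * (K - 1))


def gap_emission(a):
    """Toy probability that a gap state (X or Y) emits the symbol a."""
    return 1.0 / K


def forward_log_prob(x, y, delta=0.1, epsilon=0.3, tau=0.05):
    """Log probability of the word pair (x, y), summed over all alignments.

    delta: gap-open transition (M -> X or M -> Y), assumed value
    epsilon: gap-extend transition (X -> X or Y -> Y), assumed value
    tau: transition into the end state, assumed value
    """
    n, m = len(x), len(y)
    f_m = [[0.0] * (m + 1) for _ in range(n + 1)]
    f_x = [[0.0] * (m + 1) for _ in range(n + 1)]
    f_y = [[0.0] * (m + 1) for _ in range(n + 1)]
    f_m[0][0] = 1.0  # the start state is treated as M

    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            if i > 0 and j > 0:
                # M consumes one symbol from each word.
                f_m[i][j] = pair_emission(x[i - 1], y[j - 1]) * (
                    (1 - 2 * delta - tau) * f_m[i - 1][j - 1]
                    + (1 - epsilon - tau)
                    * (f_x[i - 1][j - 1] + f_y[i - 1][j - 1])
                )
            if i > 0:
                # X consumes one symbol from the first word only.
                f_x[i][j] = gap_emission(x[i - 1]) * (
                    delta * f_m[i - 1][j] + epsilon * f_x[i - 1][j]
                )
            if j > 0:
                # Y consumes one symbol from the second word only.
                f_y[i][j] = gap_emission(y[j - 1]) * (
                    delta * f_m[i][j - 1] + epsilon * f_y[i][j - 1]
                )

    total = tau * (f_m[n][m] + f_x[n][m] + f_y[n][m])
    return math.log(total)


if __name__ == "__main__":
    for other in ("nacht", "table"):
        print("night", other, forward_log_prob("night", other))
```

Under these toy parameters, a pair that shares several aligned letters, such as "night" and "nacht", should receive a higher log probability than an unrelated pair of the same length. Raw forward probabilities shrink as words get longer, so a practical similarity score would likely need some form of length normalization or a comparison against a random model; this sketch leaves that out.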