Computation of Normalized Edit Distance and Applications
IEEE Transactions on Pattern Analysis and Machine Intelligence
Computational dialectology in Irish Gaelic
EACL '95 Proceedings of the seventh conference on European chapter of the Association for Computational Linguistics
N-gram similarity and distance
SPIRE'05 Proceedings of the 12th international conference on String Processing and Information Retrieval
Adaptive string distance measures for bilingual dialect lexicon induction
ACL '07 Proceedings of the 45th Annual Meeting of the ACL: Student Research Workshop
Identifying linguistic structure in a quantitative analysis of dialect pronunciation
ACL '07 Proceedings of the 45th Annual Meeting of the ACL: Student Research Workshop
Can corpus based measures be used for comparative study of languages?
SigMorPhon '07 Proceedings of Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology
Inducing sound segment differences using Pair Hidden Markov Models
SigMorPhon '07 Proceedings of Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology
Creating a comparative dictionary of Totonac-Tepehua
SigMorPhon '07 Proceedings of Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology
Evaluating the pairwise string alignment of pronunciations
LaTeCH-SHELT&R '09 Proceedings of the EACL 2009 Workshop on Language Technology and Resources for Cultural Heritage, Social Sciences, Humanities, and Education
Computer Speech and Language
Word segmentation for dialect translation
CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part II
Levenshtein distances fail to identify language relationships accurately
Computational Linguistics
Dialect translation: integrating Bayesian co-segmentation models with pivot-based SMT
DIALECTS '11 Proceedings of the First Workshop on Algorithms and Resources for Modelling of Dialects and Language Varieties
Recovering dialect geography from an unaligned comparable corpus
EACL 2012 Proceedings of the EACL 2012 Joint Workshop of LINGVIS & UNCLH
Hi-index | 0.00 |
We examine various string distance measures for suitability in modeling dialect distance, especially its perception. We find measures superior which do not normalize for word length, but which are are sensitive to order. We likewise find evidence for the superiority of measures which incorporate a sensitivity to phonological context, realized in the form of n-grams--although we cannot identify which form of context (bigram, trigram, etc.) is best. However, we find no clear benefit in using gradual as opposed to binary segmental difference when calculating sequence distances.