Comparison and classification of dialects
EACL '99 Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics
Dialect groupings can be discovered objectively and automatically by cluster analysis of phonetic transcriptions such as those found in a linguistic atlas. The first step in the analysis, measuring the linguistic distance between each pair of sites, can be done by computing the Levenshtein distance between phonetic strings. This measure correlates closely with the much more laborious technique of determining and counting isoglosses, and is more accurate than the more familiar metric of Hamming distance based on whether vocabulary entries match. In the clustering step itself, traditional agglomerative clustering works better than the top-down technique of partitioning around medoids. When agglomerative clustering of these phonetic string distances is applied to Gaelic, reasonable dialect boundaries are obtained, corresponding to national and (within Ireland) provincial boundaries.
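
The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several choices in it are assumptions: unit costs for insertion, deletion and substitution; averaging word distances over the concepts two sites share; and average linkage as the agglomerative merge criterion, which the abstract does not specify. The names `site_distance`, `agglomerate` and `TOY_SITES` are hypothetical, and the transcriptions are invented toy strings, not atlas data.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between two sequences of segments (unit costs assumed)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,              # delete x
                curr[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),   # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def site_distance(words_a, words_b):
    """Mean Levenshtein distance over the concepts both sites record."""
    shared = sorted(set(words_a) & set(words_b))
    total = sum(levenshtein(words_a[c], words_b[c]) for c in shared)
    return total / len(shared)


def agglomerate(sites, k):
    """Bottom-up clustering (average linkage assumed) down to k clusters."""
    names = sorted(sites)
    dist = {
        frozenset(pair): site_distance(sites[pair[0]], sites[pair[1]])
        for pair in combinations(names, 2)
    }

    def linkage(c1, c2):
        # Average of all pairwise site distances between the two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    clusters = [[name] for name in names]
    while len(clusters) > k:
        # Merge the two closest clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Invented toy transcriptions for two concepts at four hypothetical sites.
TOY_SITES = {
    "site_a": {"w1": "kat", "w2": "dog"},
    "site_b": {"w1": "kat", "w2": "dok"},
    "site_c": {"w1": "ket", "w2": "hund"},
    "site_d": {"w1": "kett", "w2": "hunt"},
}

if __name__ == "__main__":
    print(agglomerate(TOY_SITES, 2))
```

On this toy data the two pairs of similar sites end up in separate clusters. Because `levenshtein` accepts any sequence, a transcription can also be passed as a list of segments when one phonetic segment spans several characters.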