This paper proposes a simple metric of dialect distance, based on the ratio between identical word pairs and cognate word pairs occurring in two texts. Different variations of this metric are tested on a corpus containing comparable texts from different Swiss German dialects and evaluated on the basis of spatial autocorrelation measures. The visualization of the results as cluster dendrograms shows that closely related dialects are reliably clustered together, while multidimensional scaling produces graphs that show high agreement with the geographic localization of the original texts.
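
As a rough illustration of the kind of metric described, the sketch below computes a distance from a list of cognate word pairs as one minus the share of pairs that are spelled identically. The abstract does not state the exact formula or how cognates are detected, so the `1 - identical / cognate` form, the function name `dialect_distance`, and the toy word pairs are all assumptions made for illustration, not the paper's method.

```python
from typing import Iterable, Tuple


def dialect_distance(cognate_pairs: Iterable[Tuple[str, str]]) -> float:
    """Return a distance in [0, 1] between two texts.

    Assumption: `cognate_pairs` already holds the word pairs judged to be
    cognates across the two texts (cognate detection is out of scope here).
    Assumption: distance = 1 - (identical pairs / cognate pairs).
    """
    pairs = list(cognate_pairs)
    if not pairs:
        raise ValueError("need at least one cognate pair")
    # A pair counts as identical when both dialects spell the word the same way.
    identical = sum(1 for left, right in pairs if left == right)
    return 1.0 - identical / len(pairs)


# Hypothetical toy data: two of four cognate pairs are spelled identically.
example_pairs = [
    ("huus", "huus"),
    ("chind", "chind"),
    ("gsi", "gsy"),
    ("nid", "nöd"),
]
example_distance = dialect_distance(example_pairs)
```

Pairwise distances computed this way for every pair of texts could then fill a distance matrix, which is the usual input for hierarchical clustering or multidimensional scaling.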