Adaptive string distance measures for bilingual dialect lexicon induction

Authors:
Yves Scherrer
Affiliations:
University of Geneva, Geneva, Switzerland
Venue:
ACL '07 Proceedings of the 45th Annual Meeting of the ACL: Student Research Workshop
Year:
2007

Abstract

This paper compares different measures of graphemic similarity applied to the task of bilingual lexicon induction between a Swiss German dialect and Standard German. The measures have been adapted to this particular language pair by training stochastic transducers with the Expectation-Maximisation algorithm or by using handmade transduction rules. These adaptive metrics show up to 11% F-measure improvement over a static metric like Levenshtein distance.
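The contrast between a static and an adapted metric can be illustrated with a minimal sketch. It is not the paper's method: the paper trains stochastic transducers with EM or uses handmade transduction rules, whereas the sketch below only plugs a small hand-set substitution cost table into a standard edit-distance recurrence. The cost table, the function names, and the word pair (dialect "muus", candidates "maus" and "muss") are assumptions chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

SubCost = Callable[[str, str], float]


def edit_distance(src: str, tgt: str, sub_cost: SubCost,
                  indel_cost: float = 1.0) -> float:
    """Dynamic-programming edit distance with a pluggable substitution cost."""
    # prev[j] holds the distance between the processed prefix of src and tgt[:j].
    prev = [j * indel_cost for j in range(len(tgt) + 1)]
    for i, s in enumerate(src, start=1):
        curr = [i * indel_cost]
        for j, t in enumerate(tgt, start=1):
            curr.append(min(
                prev[j] + indel_cost,          # delete s
                curr[j - 1] + indel_cost,      # insert t
                prev[j - 1] + sub_cost(s, t),  # substitute s by t (or match)
            ))
        prev = curr
    return prev[-1]


def uniform_cost(a: str, b: str) -> float:
    """Static metric: every substitution costs 1 (plain Levenshtein)."""
    return 0.0 if a == b else 1.0


# Hypothetical, hand-set costs (NOT learned, NOT taken from the paper):
# the dialect-to-standard substitution u -> a is made cheap.
ILLUSTRATIVE_COSTS: Dict[Tuple[str, str], float] = {("u", "a"): 0.2}


def adapted_cost(a: str, b: str) -> float:
    """Adapted metric: use the illustrative table, fall back to cost 1."""
    if a == b:
        return 0.0
    return ILLUSTRATIVE_COSTS.get((a, b), 1.0)


def levenshtein(a: str, b: str) -> float:
    return edit_distance(a, b, uniform_cost)


def adapted(a: str, b: str) -> float:
    return edit_distance(a, b, adapted_cost)


def rank(dialect_word: str, candidates: List[str],
         metric: Callable[[str, str], float]) -> List[str]:
    """Order candidate standard-language words by increasing distance."""
    return sorted(candidates, key=lambda c: metric(dialect_word, c))
```

Under these assumed costs, plain Levenshtein distance cannot separate the two candidates for "muus" (both are one substitution away), while the adapted metric ranks "maus" first because the substitution u to a is cheap. A learned model would estimate such costs from training pairs instead of fixing them by hand.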