Transliteration as constrained optimization

Authors: Dan Goldwasser; Dan Roth
Affiliations: University of Illinois, Urbana, IL (both authors)
Venue: EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Year: 2008

Abstract

This paper introduces a new method for identifying named-entity (NE) transliterations in bilingual corpora. Recent work has shown the advantage of discriminative approaches to transliteration: given two strings (w_s, w_t) in the source and target languages, a classifier is trained to determine whether w_t is a transliteration of w_s. This paper shows that the transliteration problem can be formulated as a constrained optimization problem and can thus take into account contextual dependencies and constraints among character bi-grams in the two strings. We further explore several methods for learning the objective function of the optimization problem and show the advantage of learning it discriminatively. Our experiments show that the new framework yields an improvement of over 50% in transliterating English NEs to Hebrew.
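
To make the idea of scoring a candidate pair by constrained optimization concrete, here is a minimal toy sketch. It is not the formulation used in the paper: it assumes single-character mappings rather than bi-grams, a hand-written (hypothetical) weight table in place of a learned objective, and two illustrative constraints (each character is mapped at most once, and mappings may not cross). Under those assumptions the best-scoring assignment can be found with a simple dynamic program; the function name `best_alignment_score` and all weights are invented for illustration.

```python
from typing import Dict, Tuple


def best_alignment_score(
    ws: str, wt: str, weights: Dict[Tuple[str, str], float]
) -> float:
    """Maximum total weight of a one-to-one, non-crossing character mapping.

    Toy objective: sum of weights[(source_char, target_char)] over the
    selected pairs. Constraints (assumed for this sketch only): each
    character takes part in at most one pair, and pairs preserve order.
    """
    n, m = len(ws), len(wt)
    # best[i][j] = optimum using the first i source and first j target chars.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = weights.get((ws[i - 1], wt[j - 1]), 0.0)
            best[i][j] = max(
                best[i - 1][j],             # leave source char i unmapped
                best[i][j - 1],             # leave target char j unmapped
                best[i - 1][j - 1] + pair,  # map source char i to target char j
            )
    return best[n][m]


# Hypothetical weights; a real system would learn these from data.
toy_weights = {
    ("d", "\u05d3"): 1.0,  # d -> dalet
    ("n", "\u05df"): 1.0,  # n -> final nun
    ("a", "\u05df"): 0.2,  # a weak, competing mapping
}

if __name__ == "__main__":
    # English "dan" against the Hebrew string dalet + final nun.
    print(best_alignment_score("dan", "\u05d3\u05df", toy_weights))
```

A candidate pair could then be accepted when this score, perhaps normalized by string length, exceeds some threshold; the choice of normalization and threshold is likewise an assumption of this sketch.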