Incorporating linguistic knowledge in statistical machine translation: translating prepositions

Authors:
Reshef Shilon;Hanna Fadida;Shuly Wintner
Affiliations:
Tel Aviv University, Israel;Technion, Israel;University of Haifa, Israel
Venue:
HYBRID '12 Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data
Year:
2012

Citing 13
Cited 0

Automatic extraction of subcategorization from corpora
ANLC '97 Proceedings of the fifth conference on Applied natural language processing

Experiments with a Hindi-to-English transfer-based MT system under a miserly data scenario
ACM Transactions on Asian Language Information Processing (TALIP)

Automatic acquisition of subcategorization frames from untagged text
ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics

Automatic acquisition of a large subcategorization dictionary from corpora
ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics

Automatic extraction of subcategorization frames for Czech
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2

BLEU: a method for automatic evaluation of machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics

Handling of prepositions in English to Bengali machine translation
Prepositions '06 Proceedings of the Third ACL-SIGSEM Workshop on Prepositions

An improved statistical transfer system for French--English machine translation
StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation

Stat-XFER: a general search-based syntax-driven framework for machine translation
CICLing'08 Proceedings of the 9th international conference on Computational linguistics and intelligent text processing

Putting pieces together: combining FrameNet, VerbNet and WordNet for robust semantic parsing
CICLing'05 Proceedings of the 6th international conference on Computational Linguistics and Intelligent Text Processing

Meteor 1.3: automatic metric for reliable optimization and evaluation of machine translation systems
WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation

Agreement constraints for statistical machine translation into German
WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation

English to Chinese translation of prepositions
AI'05 Proceedings of the 18th Canadian Society conference on Advances in Artificial Intelligence

Abstract

Prepositions are hard to translate because their meaning is often vague and the choice of the correct preposition is often arbitrary. At the same time, making the correct choice is often critical to the coherence of the output text. In statistical machine translation, the difficulty is compounded by the potentially long distance between a preposition and the head it modifies, which the local context of standard language models does not capture. In this work we use monolingual language resources to determine the set of prepositions that are most likely to occur with each verb. We use this information in a transfer-based Arabic-to-Hebrew statistical machine translation system. We show that incorporating linguistic knowledge about the distribution of prepositions significantly improves translation quality.
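
The idea of finding the prepositions most likely to occur with each verb can be illustrated with a minimal sketch. This is not the authors' procedure, which the abstract does not detail; it assumes that (verb, preposition) pairs have already been extracted from a monolingual corpus and simply estimates relative frequencies. The function names, the threshold parameter, and the English toy data are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def verb_preposition_scores(pairs):
    """Estimate P(preposition | verb) by relative frequency.

    `pairs` is an iterable of (verb, preposition) tuples, assumed to have
    been extracted beforehand from a monolingual corpus.
    """
    counts = defaultdict(Counter)
    for verb, prep in pairs:
        counts[verb][prep] += 1

    scores = {}
    for verb, prep_counts in counts.items():
        total = sum(prep_counts.values())
        scores[verb] = {p: c / total for p, c in prep_counts.items()}
    return scores


def likely_prepositions(scores, verb, threshold=0.2):
    """Return prepositions whose estimated probability meets the threshold,
    most probable first (ties broken alphabetically)."""
    ranked = sorted(scores.get(verb, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [prep for prep, score in ranked if score >= threshold]


# Toy English data, invented purely for illustration (not from the paper).
toy_pairs = [
    ("depend", "on"), ("depend", "on"), ("depend", "on"), ("depend", "upon"),
    ("look", "at"), ("look", "for"), ("look", "at"), ("look", "after"),
]
scores = verb_preposition_scores(toy_pairs)
print(likely_prepositions(scores, "depend", threshold=0.5))
```

In a translation system, a table of this kind could in principle be used to prefer candidate translations whose verb-preposition pairing is well attested, regardless of how far apart the two words are in the sentence. How the paper actually integrates the information is not stated in the abstract.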