Formemes in English-Czech deep syntactic MT

Authors:
Ondřej Dušek;Zdeněk Žabokrtský;Martin Popel;Martin Majliš;Michal Novák;David Mareček
Affiliations:
Charles University in Prague, Prague;Charles University in Prague, Prague;Charles University in Prague, Prague;Charles University in Prague, Prague;Charles University in Prague, Prague;Charles University in Prague, Prague
Venue:
WMT '12 Proceedings of the Seventh Workshop on Statistical Machine Translation
Year:
2012

Abstract

One of the most notable recent improvements to TectoMT English-to-Czech translation is a systematic and theoretically supported revision of formemes, the annotation of morpho-syntactic features of content words in deep dependency syntactic structures based on the Prague tectogrammatics theory. Our modifications aim to reduce data sparsity, increase consistency across languages, and widen the range of uses for this markup. Formemes can be used not only in MT but also in various other NLP tasks.