WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
This paper describes the 2010 phrase-based statistical machine translation system developed at the TALP Research Center of the UPC in cooperation with BMIC and VMU. In phrase-based SMT, the phrase table is the central resource for translation. It is created by extracting phrases from an aligned parallel corpus and then computing translation model scores from them. Performing a collocation segmentation over the source and target corpora before alignment causes different and longer phrases to be extracted from the same original documents. We performed this segmentation and computed the phrase table from the union of the resulting phrase set and the phrase set extracted from the non-segmented corpus. We present the configurations considered and report results obtained on internal and official test sets.
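As a rough illustration of the phrase-set union described above, the following Python sketch pools phrase pairs from two extractions and scores them. It rests on several assumptions not stated in the abstract: the union is treated as a pooling of phrase-pair occurrences, the only score computed is the relative-frequency estimate count(source, target) / count(source), and the function name `phrase_table` and the toy phrase pairs are hypothetical. The actual system may combine and score the sets differently.

```python
from collections import Counter


def phrase_table(*phrase_sets):
    """Pool phrase-pair occurrences from several extractions and
    score each pair by relative frequency given its source phrase.

    Assumption: pooling occurrences stands in for the 'union' of
    phrase sets; the real system's combination may differ.
    """
    pair_counts = Counter()
    for pairs in phrase_sets:
        pair_counts.update(pairs)

    # Total occurrences of each source phrase across all pooled pairs.
    source_counts = Counter()
    for (src, _tgt), n in pair_counts.items():
        source_counts[src] += n

    # Relative-frequency estimate: count(src, tgt) / count(src).
    return {
        (src, tgt): n / source_counts[src]
        for (src, tgt), n in pair_counts.items()
    }


# Toy phrase pairs, invented purely for illustration.
baseline = [("la casa", "the house"), ("casa", "house"), ("casa", "home")]
segmented = [("la casa blanca", "the white house"), ("casa", "house")]

table = phrase_table(baseline, segmented)
for pair, score in sorted(table.items()):
    print(pair, round(score, 3))
```

In this toy case the segmented extraction contributes a longer phrase pair that the baseline lacks, and it also reinforces the count of a pair both extractions share, which shifts the relative-frequency scores for that source phrase.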