Translating collocations for bilingual lexicons: a statistical approach
Computational Linguistics
Phrase-Based Statistical Machine Translation
KI '02 Proceedings of the 25th Annual German Conference on AI: Advances in Artificial Intelligence
A systematic comparison of various statistical alignment models
Computational Linguistics
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
TnT: a statistical part-of-speech tagger
ANLC '00 Proceedings of the sixth conference on Applied natural language processing
A comparison of alignment models for statistical machine translation
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
Combining clues for word alignment
EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
Inducing multilingual text analysis tools via robust projection across aligned corpora
HLT '01 Proceedings of the first international conference on Human language technology research
A syntax-based statistical translation model
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
An unsupervised method for word sense tagging using parallel corpora
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Extensions to HMM-based statistical word alignment models
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Experiments in parallel-text based grammar induction
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Improving word alignment quality using morpho-syntactic information
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Phrase linguistic classification and generalization for improving statistical machine translation
ACLstudent '05 Proceedings of the ACL Student Research Workshop
TALP phrase-based statistical translation system for European language pairs
StatMT '06 Proceedings of the Workshop on Statistical Machine Translation
Hi-index | 0.00 |
This paper presents a wide range of statistical word alignment experiments incorporating morphosyntactic information. By means of parallel corpus transformations according to information of POS-tagging, lemmatization or stemming, we explore which linguistic information helps improve alignment error rates. For this, evaluation against a human word alignment reference is performed, aiming at an improved machine translation training scheme which eventually leads to improved SMT performance. Experiments are carried out in a Spanish–English European Parliament Proceedings parallel corpus, both in a large and a small data track. As expected, improvements due to introducing morphosyntactic information are bigger in case of data scarcity, but significant improvement is also achieved in a large data task, meaning that certain linguistic knowledge is relevant even in situations of large data availability.