In this paper, we describe the differences between classical word alignment on the surface layer (word-layer alignment) and alignment of deep syntactic sentence representations (tectogrammatical alignment). The deep structures we use are dependency trees whose nodes are content (autosemantic) words; most function words, such as prepositions, articles, and auxiliary verbs, are hidden. We introduce an algorithm that aligns such trees using a perceptron-based scoring function. For evaluation purposes, a set of parallel sentences was manually aligned. We show that statistical word alignment (GIZA++) can improve the tectogrammatical alignment. Surprisingly, we also show that the tectogrammatical alignment can then be used to significantly improve the original word alignment.
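To make the idea of a perceptron-based scoring function over tree nodes concrete, the following is a minimal sketch, not the paper's actual implementation: the feature set, node representation, and greedy one-to-one linking strategy are all illustrative assumptions.

```python
# Sketch: align nodes of two dependency trees with a perceptron-style
# linear scoring function.  Features and search strategy are assumptions,
# not the method described in the paper.

def features(src, tgt):
    """Binary features for a candidate node pair (illustrative choices)."""
    return {
        "same_lemma": src["lemma"] == tgt["lemma"],
        "same_pos": src["pos"] == tgt["pos"],
        "both_root": src["parent"] is None and tgt["parent"] is None,
    }

def score(weights, src, tgt):
    """Linear (perceptron) score: dot product of weights and active features."""
    return sum(weights.get(name, 0.0)
               for name, on in features(src, tgt).items() if on)

def align(weights, src_nodes, tgt_nodes, threshold=0.0):
    """Greedily link each source node to its best-scoring unused target node;
    nodes whose best score does not exceed the threshold stay unaligned."""
    links, used = [], set()
    for s in src_nodes:
        best, best_score = None, threshold
        for j, t in enumerate(tgt_nodes):
            if j in used:
                continue
            sc = score(weights, s, t)
            if sc > best_score:
                best, best_score = j, sc
        if best is not None:
            links.append((s["lemma"], tgt_nodes[best]["lemma"]))
            used.add(best)
    return links
```

In training, the weights would be updated perceptron-style whenever the predicted links disagree with the manually aligned gold links; here fixed weights suffice to illustrate the scoring.
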