Incorporating Linguistic Information to Statistical Word-Level Alignment

Authors:
Eduardo Cendejas; Grettel Barceló; Alexander Gelbukh; Grigori Sidorov
Affiliations:
Center for Computing Research, National Polytechnic Institute, Mexico City, Mexico; Center for Computing Research, National Polytechnic Institute, Mexico City, Mexico; Center for Computing Research, National Polytechnic Institute, Mexico City, Mexico; Center for Computing Research, National Polytechnic Institute, Mexico City, Mexico
Venue:
CIARP '09 Proceedings of the 14th Iberoamerican Conference on Pattern Recognition: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications
Year:
2009

Abstract

Alignment algorithms enrich parallel texts by establishing correspondences between the structures of the two languages involved. Depending on the alignment level, these correspondences link paragraphs, sentences, or words of the source-language content and its translation. There are two main approaches to word-level alignment: statistical and linguistic. Because the grammar rules of the two languages differ, statistical algorithms usually yield lower precision. For this reason, such algorithms are generally developed for a specific language pair, using linguistic techniques. This paper presents a hybrid alignment system that combines the two traditional approaches. It provides user-friendly configuration and is adaptable to the computational environment. The system uses linguistic resources and procedures such as cognate identification, morphological information, syntactic trees, dictionaries, and semantic domains. We show that the system outperforms existing algorithms.
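
To make the idea of a hybrid aligner concrete, the following is a minimal sketch and not the system described in the paper. It assumes that a statistical cue (the Dice coefficient over sentence-level co-occurrence) and a linguistic cue (cognate similarity, measured here by the longest common subsequence ratio) are combined by a weighted sum. The function names, the `weight` parameter, and the toy English-Spanish corpus are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def cognate_score(a: str, b: str) -> float:
    """Longest common subsequence ratio, a simple cognate cue (assumed here)."""
    if not a or not b:
        return 0.0
    return lcs_length(a.lower(), b.lower()) / max(len(a), len(b))


def dice_table(corpus):
    """Dice coefficient for every word pair that co-occurs in a sentence pair."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in corpus:
        src_types, tgt_types = set(src), set(tgt)
        src_count.update(src_types)
        tgt_count.update(tgt_types)
        pair_count.update(product(src_types, tgt_types))
    return {
        (s, t): 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in pair_count.items()
    }


def align(src, tgt, dice, weight=0.5):
    """Link each source position to the best-scoring target position."""
    links = []
    for i, s in enumerate(src):
        # Weighted sum of the statistical and the linguistic cue (hypothetical).
        scores = [
            weight * dice.get((s, t), 0.0) + (1 - weight) * cognate_score(s, t)
            for t in tgt
        ]
        links.append((i, max(range(len(tgt)), key=scores.__getitem__)))
    return links


# Toy parallel corpus, invented for this example.
corpus = [
    ("the house".split(), "la casa".split()),
    ("the animal".split(), "el animal".split()),
    ("a house".split(), "una casa".split()),
]
dice = dice_table(corpus)
links = align(corpus[1][0], corpus[1][1], dice)
print(links)  # [(0, 0), (1, 1)]
```

In the second sentence pair, co-occurrence alone cannot separate "el" from "animal" as the translation of either English word, since both Dice scores tie. The cognate cue breaks the tie, which is the kind of benefit a hybrid approach aims for. A fuller system would add the other resources the abstract lists, such as morphological information, syntactic trees, dictionaries, and semantic domains.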