Combining clues for lexical level aligning using the null hypothesis approach

Authors:
Olivier Kraif;Boxing Chen
Affiliations:
LIDILEM, Université Stendhal, Grenoble, France;LIDILEM, Université Stendhal, Grenoble, France
Venue:
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Year:
2004

Abstract

Several kinds of information can be used to align parallel texts at the word level: co-occurrence frequencies, position differences, part of speech, graphic resemblance, etc. This paper proposes a simple method for combining these clues efficiently. The association score is computed from the probabilities of pairing two units under the null hypothesis, that is, assuming that the association is fortuitous. This approach has been applied to a literary English-French parallel text with good results.
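
The abstract does not give the formulas, so the following is only an illustrative sketch of how clues might be combined under a null hypothesis, not the authors' actual method. It assumes a hypergeometric model for chance co-occurrence, uniformly distributed relative positions for the position clue, and independence between clues; all function names and parameters are hypothetical.

```python
import math


def cooccurrence_p_value(n_segments, n_src, n_tgt, n_both):
    """Chance probability of at least n_both co-occurrences (assumed model).

    The source word occurs in n_src of the n_segments aligned segment
    pairs and the target word in n_tgt of them. Under the null hypothesis
    the two words are distributed independently, so the number of shared
    segments follows a hypergeometric distribution.
    """
    upper = min(n_src, n_tgt)
    total = math.comb(n_segments, n_tgt)
    tail = sum(
        math.comb(n_src, k) * math.comb(n_segments - n_src, n_tgt - k)
        for k in range(n_both, upper + 1)
    )
    return tail / total


def position_p_value(rel_src, rel_tgt):
    """Chance probability that two relative positions are this close.

    Positions are relative offsets in [0, 1] within the aligned segments.
    Assumption: under the null hypothesis both are independent and uniform,
    so P(|U - V| <= d) = 1 - (1 - d)^2.
    """
    d = abs(rel_src - rel_tgt)
    return 1.0 - (1.0 - d) ** 2


def association_score(p_values, floor=1e-12):
    """Combine clue probabilities into one score (assumed combination rule).

    Treating the clues as independent under the null hypothesis, the joint
    chance probability is their product; the score is its negative log, so
    a less fortuitous pairing gets a higher score. The floor avoids log(0).
    """
    return -sum(math.log(max(p, floor)) for p in p_values)


# Hypothetical counts: a pair that always co-occurs at matching positions
# versus a pair that never co-occurs and sits at opposite ends.
strong = association_score(
    [cooccurrence_p_value(100, 5, 5, 5), position_p_value(0.40, 0.45)]
)
weak = association_score(
    [cooccurrence_p_value(100, 5, 5, 0), position_p_value(0.00, 1.00)]
)
```

With these definitions, a pairing that is unlikely to arise by chance receives a higher score than one fully compatible with chance, which is the intuition the abstract describes. Other clues, such as part of speech or graphic resemblance, could be added as further probabilities in the list under the same independence assumption.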