You'll take the high road and I'll take the low road: using a third language to improve bilingual word alignment

Authors:
Lars Borin
Affiliations:
Uppsala University, Uppsala, Sweden
Venue:
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1
Year:
2000

Abstract

While language-independent sentence alignment programs typically achieve recall in the 90 percent range, the same cannot be said of language-independent word alignment systems, whose recall figures normally fall somewhere between 20 and 40 percent. Because words (and phrases) are, for various reasons, more interesting to align than sentences, we need methods to increase word alignment recall, preferably without sacrificing precision. This paper reports on a series of experiments with pivot alignment, that is, the use of one or more additional languages to improve bilingual word alignment. The conclusion is that, in a multilingual parallel corpus, pivot alignment is a safe way to increase word alignment recall without lowering precision.
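
The abstract does not spell out how the pivot links are combined with the direct ones, so the following is only a minimal sketch of one plausible reading: source-to-pivot and pivot-to-target links are composed into extra source-to-target links, which are then added to the direct alignment. The function names (`compose_via_pivot`, `pivot_align`, `precision_recall`), the set-of-pairs representation, and the toy word pairs are all assumptions made for illustration, not taken from the paper.

```python
def compose_via_pivot(src_to_pivot, pivot_to_tgt):
    """Compose source-pivot and pivot-target links into source-target links.

    Both arguments are iterables of (word, word) pairs (a hypothetical
    representation; the paper's actual data structures are not given).
    """
    targets_by_pivot = {}
    for pivot_word, tgt_word in pivot_to_tgt:
        targets_by_pivot.setdefault(pivot_word, set()).add(tgt_word)
    return {
        (src_word, tgt_word)
        for src_word, pivot_word in src_to_pivot
        for tgt_word in targets_by_pivot.get(pivot_word, ())
    }


def pivot_align(direct, src_to_pivot, pivot_to_tgt):
    """Union of the direct links and the links obtained through the pivot."""
    return set(direct) | compose_via_pivot(src_to_pivot, pivot_to_tgt)


def precision_recall(proposed, gold):
    """Precision and recall of proposed links against a gold standard."""
    correct = len(proposed & gold)
    precision = correct / len(proposed) if proposed else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Invented toy data: the direct aligner finds one of two gold links,
# and the pivot language supplies the other.
gold = {("hus", "house"), ("bil", "car")}
direct = {("hus", "house")}
src_to_pivot = {("hus", "Haus"), ("bil", "Auto")}
pivot_to_tgt = {("Haus", "house"), ("Auto", "car")}

combined = pivot_align(direct, src_to_pivot, pivot_to_tgt)
print(precision_recall(direct, gold))
print(precision_recall(combined, gold))
```

In this toy case the pivot links raise recall while every proposed link is still correct. Whether that holds on real data depends on the quality of the two pivot alignments, which is what the reported experiments evaluate.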