Knowledge intensive word alignment with KNOWA

Authors:
Emanuele Pianta;Luisa Bentivogli
Affiliations:
ITC-irst, Via Sommarive, Povo - Trento, Italy;ITC-irst, Via Sommarive, Povo - Trento, Italy
Venue:
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Year:
2004

Citing (3):
Bilingual Sentence Alignment: Balancing Robustness and Accuracy (Machine Translation)
A systematic comparison of various statistical alignment models (Computational Linguistics)
Using confidence bands for parallel texts alignment (ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics)

Cited by (3):
Exploiting parallel texts in the creation of multilingual semantically annotated resources: the MultiSemCor Corpus (Natural Language Engineering)
Evaluating cross-language annotation transfer in the MultiSemCor corpus (COLING '04 Proceedings of the 20th international conference on Computational Linguistics)
Incorporating Linguistic Information to Statistical Word-Level Alignment (CIARP '09 Proceedings of the 14th Iberoamerican Conference on Pattern Recognition: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications)

Abstract

In this paper we present KNOWA, an English/Italian word aligner developed at ITC-irst that relies mostly on information contained in bilingual dictionaries. The performance of KNOWA is compared with that of GIZA++, a state-of-the-art statistics-based alignment algorithm. The two algorithms are evaluated on the EuroCor and MultiSemCor tasks, that is, on two publicly available English/Italian parallel corpora. The results show that, given the nature and size of the available English/Italian parallel corpora, a language-resource-based word aligner such as KNOWA can outperform a fully statistics-based algorithm such as GIZA++.
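
To make the idea of dictionary-based alignment concrete, the following is a minimal sketch and not KNOWA's actual algorithm, which the abstract does not describe. It assumes a toy bilingual dictionary (`TOY_DICTIONARY`, invented for illustration) and a hypothetical function `align_with_dictionary` that links each English token to the nearest unused Italian token listed as one of its translations.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy dictionary for illustration only; it is not taken from
# the paper or from the resources KNOWA actually uses.
TOY_DICTIONARY: Dict[str, Set[str]] = {
    "the": {"il", "la", "lo"},
    "cat": {"gatto"},
    "sleeps": {"dorme"},
}


def align_with_dictionary(
    source: List[str],
    target: List[str],
    dictionary: Dict[str, Set[str]],
) -> List[Tuple[int, int]]:
    """Greedily link source tokens to target tokens via dictionary lookup.

    Returns (source_index, target_index) pairs. Each target token is
    linked at most once; source tokens with no match stay unaligned.
    """
    links: List[Tuple[int, int]] = []
    used: Set[int] = set()
    for i, word in enumerate(source):
        translations = dictionary.get(word.lower(), set())
        # Target positions that are still free and are listed translations.
        candidates = [
            j for j, tok in enumerate(target)
            if j not in used and tok.lower() in translations
        ]
        if candidates:
            # Assumption: prefer the candidate closest in position.
            best = min(candidates, key=lambda j: abs(j - i))
            used.add(best)
            links.append((i, best))
    return links


if __name__ == "__main__":
    english = ["The", "cat", "sleeps"]
    italian = ["Il", "gatto", "dorme"]
    print(align_with_dictionary(english, italian, TOY_DICTIONARY))
```

A real aligner of this kind would also need morphological analysis, multiword handling, and a fallback for words missing from the dictionary. The positional tie-break used here is only one simple heuristic among many.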