Automatic acquisition of bilingual rules for extraction of bilingual word pairs from parallel corpora

Authors:
Hiroshi Echizen-ya; Kenji Araki; Yoshio Momouchi
Affiliations:
Hokkai-Gakuen University, Chuo-ku Sapporo, Japan; Hokkaido University, Kita-ku Sapporo, Japan; Hokkai-Gakuen University, Chuo-ku Sapporo, Japan
Venue:
DeepLA '05 Proceedings of the ACL-SIGLEX Workshop on Deep Lexical Acquisition
Year:
2005

Abstract

In this paper, we propose a new learning method that addresses the sparse data problem in the automatic extraction of bilingual word pairs from parallel corpora in various languages. The method automatically acquires rules that are effective against the sparse data problem from the parallel corpora alone, without requiring any bilingual resource (e.g., a bilingual dictionary or a machine translation system) beforehand. We call this method Inductive Chain Learning (ICL). ICL can limit the search scope when deciding on equivalents. With ICL, the recall of three systems based on similarity measures improved by 8.0, 6.1, and 6.0 percentage points, respectively. In addition, the recall of GIZA++ improved by 6.6 percentage points with ICL.
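
To make the sparse data problem concrete, the following is a minimal sketch of similarity-based extraction, not an implementation of ICL. It assumes the Dice coefficient over sentence-level co-occurrence counts as the similarity measure (the abstract does not say which measures the three systems use), and the function names and the toy romanized corpus are hypothetical.

```python
from collections import Counter
from itertools import product


def dice_scores(pairs):
    """Score every (source word, target word) pair by the Dice coefficient
    over sentence-level co-occurrence counts (an assumed measure)."""
    src_freq, tgt_freq, co_freq = Counter(), Counter(), Counter()
    for src_sent, tgt_sent in pairs:
        src_words, tgt_words = set(src_sent.split()), set(tgt_sent.split())
        src_freq.update(src_words)
        tgt_freq.update(tgt_words)
        co_freq.update(product(src_words, tgt_words))
    return {
        (s, t): 2 * c / (src_freq[s] + tgt_freq[t])
        for (s, t), c in co_freq.items()
    }


def best_equivalents(scores, src_word):
    """Return all target words tied for the highest score for src_word."""
    cands = {t: v for (s, t), v in scores.items() if s == src_word}
    top = max(cands.values())
    return sorted(t for t, v in cands.items() if v == top)


# Toy parallel corpus (hypothetical data, for illustration only).
corpus = [
    ("i like tea", "watashi wa ocha ga suki"),
    ("i like coffee", "watashi wa koohii ga suki"),
    ("tea is hot", "ocha wa atsui"),
]
scores = dice_scores(corpus)
print(best_equivalents(scores, "tea"))  # word seen in two sentence pairs
print(best_equivalents(scores, "hot"))  # word seen in one sentence pair
print(best_equivalents(scores, "is"))   # word seen in one sentence pair
```

In this toy corpus, "tea" occurs twice and is paired with a single best candidate, whereas "hot" and "is" each occur once and receive the same top-scoring candidate. Co-occurrence counts alone cannot separate such low-frequency words, which is the kind of case where narrowing the search scope for equivalents, as the abstract describes for ICL, would be expected to help.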