Integrating cross-lingually relevant news articles and monolingual web documents in bilingual lexicon acquisition

Authors:
Takehito Utsuro; Kohei Hino; Mitsuhiro Kida; Seiichi Nakagawa; Satoshi Sato
Affiliations:
Kyoto University, Kyoto, Japan; Toyohashi University of Technology, Toyohashi, Japan; Kyoto University, Kyoto, Japan; Toyohashi University of Technology, Toyohashi, Japan; Kyoto University, Kyoto, Japan
Venue:
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Year:
2004

Abstract

In bilingual lexicon acquisition from cross-lingually relevant news articles on the Web, it is relatively hard to reliably estimate bilingual term correspondences for low-frequency terms. To address this, this paper proposes complementing the news articles with a much larger set of monolingual Web documents collected through search engines, as a resource for reliably re-estimating bilingual term correspondences. We show experimentally that, given a sufficient number of monolingual Web documents, it is possible to obtain reliable estimates of bilingual term correspondences for those low-frequency terms.