Bilingual news article alignment methods based on multi-lingual information retrieval have been shown to be successful for the automatic production of so-called noisy-parallel corpora. In this paper we compare machine translation (MT) with the commonly used dictionary term lookup (DTL) method for aligning Reuters news articles in English and Japanese. The results show a trade-off between the improved lexical disambiguation provided by MT and the wider choice of synonyms provided by DTL: MT is superior to DTL only at medium and low recall levels, whereas at high recall levels DTL has superior precision.
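As a rough illustration of the DTL idea only, the sketch below replaces each source-language term with every translation listed in a bilingual dictionary and then picks the target-language article whose terms are most similar to the translated query. It is a minimal sketch under stated assumptions, not the method evaluated in the paper: the toy dictionary, the romanized Japanese terms, the cosine similarity measure, and the function names `dtl_translate`, `cosine`, and `align` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy dictionary: romanized Japanese term -> English candidates.
TOY_DICT = {
    "ginko": ["bank"],
    "kinri": ["interest", "rate"],
    "kabu": ["stock", "share"],
}


def dtl_translate(terms, dictionary):
    """Replace each source term with all of its dictionary translations.

    Keeping every candidate widens synonym coverage but performs no
    lexical disambiguation. Terms missing from the dictionary are dropped.
    """
    translated = []
    for term in terms:
        translated.extend(dictionary.get(term, []))
    return translated


def cosine(a, b):
    """Cosine similarity between two bags of words (assumed measure)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def align(source_terms, candidates, dictionary):
    """Return the name of the candidate article closest to the translated query."""
    query = dtl_translate(source_terms, dictionary)
    return max(candidates, key=lambda name: cosine(query, candidates[name]))


if __name__ == "__main__":
    english_articles = {
        "a": ["bank", "interest", "rate", "rise"],
        "b": ["football", "match", "result"],
    }
    print(align(["ginko", "kinri"], english_articles, TOY_DICT))
```

An MT-based variant would swap `dtl_translate` for a translation system that emits a single disambiguated rendering of each article, which is the contrast the abstract draws between the two approaches.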