Automatic construction of parallel English-Chinese corpus for cross-language information retrieval

Authors:
Jiang Chen; Jian-Yun Nie
Affiliations:
Université de Montréal, Montreal (Quebec), Canada; Université de Montréal, Montreal (Quebec), Canada
Venue:
ANLC '00 Proceedings of the sixth conference on Applied natural language processing
Year:
2000

Abstract

A major obstacle to the construction of a probabilistic translation model is the lack of large parallel corpora. In this paper, we first describe a parallel text mining system that automatically finds parallel texts on the Web. The resulting Chinese-English parallel corpus is used to train a probabilistic translation model that translates queries for Chinese-English cross-language information retrieval (CLIR). We then discuss some problems in translation model training and present preliminary CLIR results.
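
To make the idea of training a probabilistic translation model on a parallel corpus and using it for query translation concrete, the following is a minimal sketch. It assumes an IBM Model 1 style word-translation model trained with EM, which the abstract does not confirm as the authors' method. The toy sentence pairs, the function names train_ibm1 and translate_query, the English-to-Chinese direction, and the omission of a NULL word are all illustrative assumptions, and the sentences are assumed to be already aligned and tokenized.

```python
from collections import defaultdict


def train_ibm1(pairs, iterations=10):
    """Estimate word-translation probabilities t[(tgt, src)] = P(tgt | src).

    Assumption: IBM Model 1 style EM without a NULL word; `pairs` is a list
    of (source_tokens, target_tokens) taken from aligned sentence pairs.
    """
    tgt_vocab = {w for _, tgt in pairs for w in tgt}
    uniform = 1.0 / len(tgt_vocab)
    t = defaultdict(lambda: uniform)  # start from a uniform distribution

    for _ in range(iterations):
        count = defaultdict(float)  # expected (tgt, src) co-translation counts
        total = defaultdict(float)  # expected counts per source word
        # E-step: spread each target word over the source words of its pair.
        for src, tgt in pairs:
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalize the expected counts into probabilities.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


def translate_query(query, t, top_k=1):
    """Replace each query word by its top_k most probable translations."""
    by_src = defaultdict(list)
    for (f, e), p in t.items():
        by_src[e].append((p, f))
    return [[f for _, f in sorted(by_src[e], reverse=True)[:top_k]] for e in query]


# Hypothetical toy corpus standing in for a mined parallel corpus.
pairs = [
    (["red", "book"], ["红", "书"]),
    (["red", "flower"], ["红", "花"]),
    (["blue", "book"], ["蓝", "书"]),
]
model = train_ibm1(pairs)
print(translate_query(["red", "book"], model))
```

In a retrieval setting, the translated words (possibly several per query word, weighted by their probabilities) would be submitted as the query in the other language; how the translations are selected and weighted is a design choice this sketch does not settle.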