A major obstacle to the construction of a probabilistic translation model is the lack of large parallel corpora. In this paper, we first describe a parallel text mining system that automatically finds parallel texts on the Web. The generated Chinese-English parallel corpus is used to train a probabilistic translation model, which translates queries for Chinese-English cross-language information retrieval (CLIR). We then discuss some problems in translation model training and present preliminary CLIR results.