Adaptive Parallel Sentences Mining from Web Bilingual News Collection
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
HMM-based word alignment in statistical translation
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2
Processing comparable corpora with Bilingual Suffix Trees
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Improving Machine Translation Performance by Exploiting Non-Parallel Corpora
Computational Linguistics
Extracting parallel sub-sentential fragments from non-parallel corpora
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Statistical machine translation of texts with misspelled words
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
An empirical study on web mining of parallel data
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Paraphrase fragment extraction from monolingual comparable corpora
BUCC '11 Proceedings of the 4th Workshop on Building and Using Comparable Corpora: Comparable Corpora and the Web
Automatic filtering of bilingual corpora for statistical machine translation
NLDB'05 Proceedings of the 10th international conference on Natural Language Processing and Information Systems
Hi-index | 0.00 |
SMT systems rely on sufficient amount of parallel corpora to train the translation model. This paper investigates possibilities to use word-to-word and phrase-to-phrase translations extracted not only from clean parallel corpora but also from noisy comparable corpora. Translation results for a Chinese to English translation task are given.