This paper presents a semi-supervised learning framework for mining Chinese-English lexicons from a large number of Chinese Web pages. The work is motivated by the observation that many Chinese neologisms are accompanied by their English translations in parentheses. We classify parenthetical translations into bilingual abbreviations, transliterations, and translations. A frequency-based term recognition approach is applied to extract bilingual abbreviations. A self-training algorithm is proposed for mining transliteration and translation lexicons; it exploits available lexicons at the morpheme level, i.e., phoneme correspondences in transliteration and grapheme (e.g., suffix, stem, and prefix) correspondences in translation. The experimental results verify the effectiveness of our approaches.
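To make the idea of frequency-based mining of parenthetical pairs concrete, here is a minimal Python sketch. It is not the paper's term recognition method, which the abstract does not specify; it is a simplified heuristic under stated assumptions. The Chinese term's left boundary is unknown, so every suffix of the text preceding the parenthesis is treated as a candidate, and the longest candidate seen at least `min_count` times is kept. The names `PAREN`, `candidate_pairs`, `mine_abbreviations`, `max_len`, and `min_count` are all hypothetical.

```python
import re
from collections import Counter

# Assumed pattern: a run of CJK characters immediately followed by a
# parenthesised Latin-script string (ASCII or full-width parentheses).
PAREN = re.compile(r"([\u4e00-\u9fff]+)[(（]([A-Za-z][A-Za-z .\-]*)[)）]")


def candidate_pairs(text, max_len=8):
    """Yield (chinese_suffix, english) candidates for each parenthetical.

    The true left boundary of the Chinese term is unknown, so every
    suffix of the preceding CJK run (up to max_len characters) is a
    candidate.
    """
    for match in PAREN.finditer(text):
        left, english = match.group(1), match.group(2).strip()
        for n in range(1, min(max_len, len(left)) + 1):
            yield left[-n:], english


def mine_abbreviations(pages, min_count=2):
    """Return {english: (chinese, count)} using a simple frequency heuristic.

    Among candidates seen at least min_count times, keep the longest
    Chinese string for each English string. This is an illustrative
    stand-in, not the method evaluated in the paper.
    """
    counts = Counter()
    for page in pages:
        counts.update(candidate_pairs(page))

    best = {}
    for (chinese, english), count in counts.items():
        if count < min_count:
            continue
        if english not in best or len(chinese) > len(best[english][0]):
            best[english] = (chinese, count)
    return best


if __name__ == "__main__":
    pages = [
        "我们使用世界贸易组织(WTO)的数据",
        "他加入世界贸易组织(WTO)之后",
    ]
    print(mine_abbreviations(pages))
```

Longer suffixes that include surrounding context (such as the verb before the term) differ between pages and therefore fall below the frequency threshold, which is why repeated observation helps locate the term boundary. Real Web data would need noise filtering that this sketch leaves out.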