This paper proposes a method for aligning a Korean-English parallel corpus. The structural dissimilarity between Korean and Indo-European languages requires more flexible measures for evaluating alignment candidates between bilingual units than are used for pairs of Indo-European languages. The flexible measure is intended to capture the dependency between bilingual items that can occur in different units according to different ordering rules. The proposed method takes phrases as the alignment unit, a departure from existing methods that take words as the unit. Phrasal alignment avoids the alignment-unit problem and eases the ordering-mismatch problem. The parameters are estimated using the EM algorithm, and the proposed alignment algorithm is based on dynamic programming. In experiments carried out on 253,000 English words and their Korean translations, the proposed method achieved 68.7% accuracy at the phrase level and 89.2% accuracy with the bilingual dictionary induced from the alignment. The result of the alignment may lead to richer bilingual data than can be derived from word-level alignments alone.
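The abstract does not spell out the model, so the following is only a minimal sketch of how EM can estimate translation probabilities between alignment units. It assumes an IBM Model 1 style formulation, which is a stand-in and not necessarily the paper's own model. The function name `em_translation_probs`, the `iterations` parameter, and the toy romanized Korean units in the usage note are all hypothetical.

```python
from collections import defaultdict


def em_translation_probs(pairs, iterations=20):
    """Estimate t(target_unit | source_unit) by EM.

    Assumption: IBM Model 1 style model in which every target unit is
    generated by exactly one source unit of the same sentence pair.
    `pairs` is a list of (source_units, target_units) tuples; a unit may
    be a word or a phrase, represented here as a string.
    """
    source_vocab = {s for src, _ in pairs for s in src}
    target_vocab = {t for _, tgt in pairs for t in tgt}

    # Start from a uniform distribution over target units.
    uniform = 1.0 / len(target_vocab)
    t = {(tu, su): uniform for su in source_vocab for tu in target_vocab}

    for _ in range(iterations):
        count = defaultdict(float)  # expected co-generation counts
        total = defaultdict(float)  # expected counts per source unit

        # E-step: distribute each target unit over the source units
        # of its sentence pair in proportion to the current t.
        for src, tgt in pairs:
            for tu in tgt:
                norm = sum(t[(tu, su)] for su in src)
                for su in src:
                    frac = t[(tu, su)] / norm
                    count[(tu, su)] += frac
                    total[su] += frac

        # M-step: renormalize expected counts into probabilities.
        for (tu, su) in t:
            if total[su] > 0:
                t[(tu, su)] = count[(tu, su)] / total[su]

    return t
```

A toy call might pass three sentence pairs such as `(["I", "bought", "the book"], ["na-neun", "chaek-eul", "sat-da"])`; the resulting table can then score alignment candidates, for example inside a dynamic-programming search over phrase sequences.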