The medical thesaurus MeSH is used to index medical documents. A Korean MeSH has also been developed, but it lacks many of the synonymous translations of the English terms. Coverage of synonymous translations is important for indexing medical documents correctly. In this paper, we propose an approximate phrase match method for extracting synonymous translations from Korean medical documents in which English terms appear in parentheses or English keywords appear in the keyword field. The approximate phrase match handles terms that are not registered in a bilingual dictionary. An empirical evaluation showed that the proposed methods are very effective for compiling translation phrase pairs.
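The abstract does not spell out the matching procedure, so the following is only a minimal sketch of what an approximate phrase match over parenthesized English terms might look like, not the authors' method. The toy dictionary, the substring-based word match, the score (fraction of English words with a dictionary translation found in the Korean window), and the names `TOY_DICT`, `match_score`, `extract_pairs`, `threshold`, and `slack` are all assumptions made for illustration.

```python
import re

# Hypothetical toy bilingual dictionary (English word -> Korean translations).
# "infarction" is deliberately missing to imitate an unregistered term.
TOY_DICT = {
    "acute": {"급성"},
    "myocardial": {"심근"},
}

# An English term enclosed in parentheses, e.g. "(acute myocardial infarction)".
PAREN_EN = re.compile(r"\(([A-Za-z][A-Za-z \-]*)\)")


def match_score(korean_tokens, english_words, dictionary):
    """Fraction of English words whose dictionary translation occurs
    as a substring of some Korean token in the window."""
    hits = 0
    for word in english_words:
        translations = dictionary.get(word.lower(), set())
        if any(t in tok for t in translations for tok in korean_tokens):
            hits += 1
    return hits / len(english_words)


def extract_pairs(text, dictionary, threshold=0.5, slack=1):
    """Return (korean_phrase, english_term, score) for each parenthesized
    English term whose preceding Korean window matches well enough.

    Assumption: the Korean phrase directly precedes the parenthesis and is
    at most len(english_words) + slack whitespace-separated tokens long.
    """
    pairs = []
    for m in PAREN_EN.finditer(text):
        english_term = m.group(1).strip()
        english_words = english_term.split()
        before = text[:m.start()].split()
        best = None
        for n in range(1, len(english_words) + slack + 1):
            if n > len(before):
                break
            window = before[-n:]
            score = match_score(window, english_words, dictionary)
            # Keep the shortest window that reaches the highest score.
            if score >= threshold and (best is None or score > best[0]):
                best = (score, " ".join(window))
        if best is not None:
            pairs.append((best[1], english_term, best[0]))
    return pairs


if __name__ == "__main__":
    sentence = "환자는 급성 심근경색(acute myocardial infarction) 진단을 받았다."
    for korean, english, score in extract_pairs(sentence, TOY_DICT):
        print(korean, "<->", english, round(score, 2))
```

In this sketch the pair is accepted even though "infarction" has no dictionary entry, because two of the three English words are matched. That is the sense in which the match is approximate here; a real system would need a better way to decide where the Korean phrase begins.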