A word-aligned bilingual corpus is an important knowledge source for many NLP tasks, especially machine translation. For existing word alignment methods, the unknown-word problem, the synonym problem, and the global optimization problem are important factors affecting the recall and precision of alignment results. In this paper, we propose a word alignment model for Chinese and Japanese that measures word similarity in terms of morphological similarity, semantic distance, part of speech, and co-occurrence, and matches words by maximum-weight matching on a bipartite graph. The model partly solves the problems mentioned above. Experiments show that the model is effective: it achieved an F-score of 80%, compared with 72% for GIZA++.
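The matching step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the four similarity measures are combined, so the weighted sum and the values in `WEIGHTS` are assumptions, and the names `combined_similarity` and `max_weight_matching` are hypothetical. The matching is found by exhaustive search, which is only practical for toy inputs; a real system would more likely use a polynomial-time method such as the Hungarian algorithm.

```python
from itertools import permutations

# Hypothetical feature weights (assumption): the abstract names the four
# similarity measures but not how they are combined.
WEIGHTS = {"morph": 0.4, "sem": 0.3, "pos": 0.1, "cooc": 0.2}


def combined_similarity(features):
    """Weighted sum of per-feature similarity scores, each assumed in [0, 1]."""
    return sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)


def max_weight_matching(weights):
    """Return (total, pairs) for the one-to-one pairing with the largest sum.

    weights[i][j] is the similarity between source word i and target word j.
    Every word on the shorter side is paired with a distinct word on the
    longer side. Exhaustive search, so suitable for short sentences only.
    """
    n_src = len(weights)
    n_tgt = len(weights[0]) if weights else 0
    if n_src == 0 or n_tgt == 0:
        return 0.0, []

    # Work with the shorter side as rows so each row gets a distinct column.
    transposed = n_src > n_tgt
    if transposed:
        weights = [list(col) for col in zip(*weights)]
        n_src, n_tgt = n_tgt, n_src

    best_total, best_pairs = float("-inf"), []
    for perm in permutations(range(n_tgt), n_src):
        total = sum(weights[i][j] for i, j in enumerate(perm))
        if total > best_total:
            best_total, best_pairs = total, list(enumerate(perm))

    if transposed:
        # Map back to (source index, target index) order.
        best_pairs = sorted((j, i) for i, j in best_pairs)
    return best_total, best_pairs


# Toy similarity matrix: 2 source words, 3 target words (made-up values).
toy = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.10],
]
total, pairs = max_weight_matching(toy)
print(pairs, round(total, 2))
```

In this toy matrix, greedily linking source word 0 to its best target (column 0) leaves source word 1 with a weak link, for a total of 1.10. The maximum-weight matching instead pairs word 0 with column 1 and word 1 with column 0, for a total of 1.65, which is the kind of global optimization the abstract refers to.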