Extraction of translation unit from Chinese-English parallel corpora

Authors:
Chang Baobao;Pernilla Danielsson;Wolfgang Teubert
Affiliations:
Peking University, Beijing, P. R. China;Birmingham University, Birmingham, United Kingdom;Birmingham University, Birmingham, United Kingdom
Venue:
SIGHAN '02 Proceedings of the first SIGHAN workshop on Chinese language processing - Volume 18
Year:
2002

Citing 1

Identifying word correspondence in parallel texts
HLT '91 Proceedings of the workshop on Speech and Natural Language

Cited 5

Improving statistical machine translation using domain bilingual multiword expressions
MWE '09 Proceedings of the Workshop on Multiword Expressions: Identification, Interpretation, Disambiguation and Applications

Extraction of multi-word expressions from small parallel corpora
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters

Identification of multi-word expressions by combining multiple linguistic information sources
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Extracting terminologically relevant collocations in the translation of Chinese monograph
IJCNLP'05 Proceedings of the Second international joint conference on Natural Language Processing

Extraction of multi-word expressions from small parallel corpora
Natural Language Engineering

Abstract

A growing number of researchers recognize the value of parallel corpora for research on machine translation and machine-aided translation. This paper examines how Chinese-English translation units can be extracted from a parallel corpus. An iterative algorithm based on the degree of word association is proposed to identify multiword units in Chinese and in English. Chinese-English translation equivalent pairs are then extracted from the parallel corpus. We also compare different statistical association measures.
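
The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea: adjacent tokens are joined repeatedly whenever their association score reaches a threshold, so that longer units can grow over several rounds. The choice of measures (pointwise mutual information and the Dice coefficient), the threshold, the minimum count, the `_` joiner, and all function names are assumptions made for illustration. They are not taken from the paper and may differ from the measures it compares.

```python
import math
from collections import Counter


def pmi(pair_count, left_count, right_count, total):
    """Pointwise mutual information (base 2) of an adjacent token pair."""
    return math.log2(pair_count * total / (left_count * right_count))


def dice(pair_count, left_count, right_count, total):
    """Dice coefficient of an adjacent token pair (total is unused)."""
    return 2 * pair_count / (left_count + right_count)


def merge_once(sentences, measure, threshold, min_count):
    """Run one pass: join adjacent tokens that are strongly associated."""
    unigrams = Counter(tok for sent in sentences for tok in sent)
    bigrams = Counter(p for sent in sentences for p in zip(sent, sent[1:]))
    total = sum(unigrams.values())
    # Hypothetical selection rule: frequent enough and score above threshold.
    strong = {
        pair
        for pair, n in bigrams.items()
        if n >= min_count
        and measure(n, unigrams[pair[0]], unigrams[pair[1]], total) >= threshold
    }
    merged = []
    for sent in sentences:
        out, i = [], 0
        while i < len(sent):
            if i + 1 < len(sent) and (sent[i], sent[i + 1]) in strong:
                out.append(sent[i] + "_" + sent[i + 1])  # greedy, left to right
                i += 2
            else:
                out.append(sent[i])
                i += 1
        merged.append(out)
    return merged


def extract_units(sentences, measure=dice, threshold=0.8,
                  min_count=2, max_rounds=5):
    """Repeat merging until nothing changes or max_rounds is reached."""
    for _ in range(max_rounds):
        merged = merge_once(sentences, measure, threshold, min_count)
        if merged == sentences:
            break
        sentences = merged
    return sentences


corpus = [
    ["new", "york", "city", "is", "big"],
    ["i", "love", "new", "york", "city"],
]
units = extract_units(corpus)
```

On this toy corpus the first round joins `new` and `york`, and the second round joins the result with `city`, which is the kind of growth an iterative scheme allows. Swapping `measure=pmi` (with a threshold on the PMI scale) is one simple way to compare how different association measures change the extracted units.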