This paper presents ongoing research on the automatic extraction of bilingual lexicons from English-Japanese parallel corpora. Its main objective is to examine several N-gram models for generating translation units for bilingual lexicon extraction. Three N-gram models are compared: a baseline model (Bound-length N-gram) and two new models (Chunk-bound N-gram and Dependency-linked N-gram). An experiment with 10,000 English-Japanese parallel sentences shows that Chunk-bound N-gram produces the best result in terms of both accuracy (83%) and coverage (60%), improving on the previously proposed baseline model by approximately 13% in accuracy and by 5-9% in coverage.
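
The abstract names the models without defining them, so the following Python sketch is only one plausible reading, not the authors' implementation. It assumes that Bound-length N-gram means every contiguous word sequence up to a fixed maximum length, and that Chunk-bound N-gram means the same sequences restricted so that they never cross a chunk boundary. The function names, the `max_n` parameter, and the example sentence with its chunking are hypothetical and chosen for illustration; the Dependency-linked model is omitted because it would need a parser's output.

```python
from typing import List, Tuple


def bound_length_ngrams(tokens: List[str], max_n: int = 3) -> List[Tuple[str, ...]]:
    """Assumed baseline: all contiguous n-grams of length 1..max_n in a sentence."""
    return [
        tuple(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def chunk_bound_ngrams(chunks: List[List[str]], max_n: int = 3) -> List[Tuple[str, ...]]:
    """Assumed chunk-bound variant: n-grams are generated inside each chunk only."""
    units: List[Tuple[str, ...]] = []
    for chunk in chunks:
        units.extend(bound_length_ngrams(chunk, max_n))
    return units


# Hypothetical sentence, already segmented into two chunks.
chunks = [["the", "bilingual", "lexicon"], ["was", "extracted"]]
tokens = [word for chunk in chunks for word in chunk]

baseline = bound_length_ngrams(tokens)
chunked = chunk_bound_ngrams(chunks)

# Candidate units that straddle the chunk boundary, e.g. ("lexicon", "was").
crossing = sorted(set(baseline) - set(chunked))
```

Under these assumptions the chunk-bound candidates are a subset of the baseline candidates: units such as ("lexicon", "was") are generated by the baseline but never by the chunk-bound variant. Whether this pruning is what drives the reported accuracy gain is not stated in the abstract.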