Word association norms, mutual information, and lexicography
Computational Linguistics
Identifying word correspondence in parallel texts
HLT '91 Proceedings of the workshop on Speech and Natural Language
Translating collocations for bilingual lexicons: a statistical approach
Computational Linguistics
Introduction to Modern Information Retrieval
Introduction to Modern Information Retrieval
Bilingual Sentence Alignment: Balancing Robustness and Accuracy
Machine Translation
EPIA '99 Proceedings of the 9th Portuguese Conference on Artificial Intelligence: Progress in Artificial Intelligence
Accurate methods for the statistics of surprise and coincidence
Computational Linguistics - Special issue on using large corpora: I
Computational Linguistics - Special issue on using large corpora: I
Bitext maps and alignment via pattern recognition
Computational Linguistics
Aligning sentences in parallel corpora
ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
Char_align: a program for aligning parallel texts at the character level
ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Using confidence bands for parallel texts alignment
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Aligning portuguese and chinese parallel texts using confidence bands
PRICAI'00 Proceedings of the 6th Pacific Rim international conference on Artificial intelligence
EPIA '09 Proceedings of the 14th Portuguese Conference on Artificial Intelligence: Progress in Artificial Intelligence
Extracting term equivalents is one of the most important tasks in building bilingual dictionaries. Several measures have been proposed for extracting translation equivalents from aligned parallel texts. In this paper, we compare 28 similarity measures based on the co-occurrence of words in aligned parallel text segments. The parallel texts are aligned using a simple method that extends previous work by Pascale Fung & Kathleen McKeown and by Melamed but that, in contrast, does not rely on statistically unsupported heuristics to filter reliable points.
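To make the idea of a co-occurrence-based similarity measure concrete, the following is a minimal sketch, not the paper's implementation. The abstract does not name the 28 measures it compares, so the Dice coefficient and pointwise mutual information are used here only as two well-known illustrative examples. The function names, the whitespace tokenization, and the toy English-Portuguese segments are all assumptions made for this illustration.

```python
import math
from collections import Counter
from itertools import product


def cooccurrence_counts(segment_pairs):
    """Count segment-level occurrences of each word and of each
    cross-language word pair over aligned (source, target) segments.
    Assumption: words are whitespace-separated; a word is counted at
    most once per segment."""
    src_counts, tgt_counts, pair_counts = Counter(), Counter(), Counter()
    for src_seg, tgt_seg in segment_pairs:
        src_words = set(src_seg.split())
        tgt_words = set(tgt_seg.split())
        src_counts.update(src_words)
        tgt_counts.update(tgt_words)
        # Every source word co-occurs with every target word of the segment.
        pair_counts.update(product(src_words, tgt_words))
    return src_counts, tgt_counts, pair_counts


def dice(n_xy, n_x, n_y):
    """Dice coefficient from the joint and the two marginal counts."""
    return 2.0 * n_xy / (n_x + n_y)


def pmi(n_xy, n_x, n_y, n):
    """Pointwise mutual information in bits; requires n_xy > 0.
    n is the total number of aligned segments."""
    return math.log2((n_xy * n) / (n_x * n_y))


# Hypothetical toy data: three aligned segment pairs.
segments = [
    ("the house", "a casa"),
    ("the dog", "o cão"),
    ("a house", "uma casa"),
]
src, tgt, pairs = cooccurrence_counts(segments)
n = len(segments)

dice_house_casa = dice(pairs[("house", "casa")], src["house"], tgt["casa"])
dice_the_casa = dice(pairs[("the", "casa")], src["the"], tgt["casa"])
pmi_house_casa = pmi(pairs[("house", "casa")], src["house"], tgt["casa"], n)

print(dice_house_casa, dice_the_casa, pmi_house_casa)
```

In this toy example "house" and "casa" always appear in the same aligned segments, so their Dice score is higher than that of "the" and "casa", which share only one segment. Any measure of this family can be computed from the same four counts, which is what makes a side-by-side comparison of many measures straightforward.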