This paper addresses the automatic compilation of bilingual dictionaries from specialized comparable corpora. We focus on a method that automatically extracts and aligns neoclassical compounds in two languages from comparable corpora. To do so, we assume that neoclassical compounds translate compositionally into neoclassical compounds from one language to another. The method covers the two main forms of neoclassical compounds and proceeds in three steps: extraction, generation, and selection. Our program takes as input a list of aligned neoclassical elements and a bilingual dictionary for the two languages. We also align neoclassical compounds through a pivot language, relying on the hypothesis that a neoclassical element keeps a stable meaning across languages. We experiment with four languages (English, French, German, and Spanish) using corpora from the renewable energy domain, and obtain a precision of 96%.
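The three steps named above (extraction, generation, selection) can be illustrated with a minimal sketch. It is not the authors' implementation. The toy element list, the toy dictionary, the greedy prefix-based decomposition, and all function names are assumptions made for illustration. The sketch also assumes that the two forms are compounds made only of neoclassical elements and compounds made of elements followed by a dictionary word, and that component order is preserved in translation.

```python
from itertools import product
from typing import Dict, Iterable, List, Optional

# Hypothetical toy resources; real inputs would be far larger.
ELEMENTS_EN_FR: Dict[str, str] = {
    "hydro": "hydro", "bio": "bio", "photo": "photo", "logy": "logie",
}
DICT_EN_FR: Dict[str, List[str]] = {"fuel": ["carburant", "combustible"]}


def decompose(word: str, elements: Dict[str, str],
              lexicon: Dict[str, List[str]]) -> Optional[List[str]]:
    """Split a word into neoclassical elements, optionally ending in a dictionary word."""
    for el in sorted(elements, key=len, reverse=True):  # try longest elements first
        if word.startswith(el):
            rest = word[len(el):]
            if not rest:
                return [el]
            if rest in lexicon:
                return [el, rest]
            tail = decompose(rest, elements, lexicon)
            if tail is not None:
                return [el] + tail
    return None


def extract(words: Iterable[str], elements, lexicon) -> Dict[str, List[str]]:
    """Extraction: keep source words that decompose into at least two parts."""
    found = {}
    for w in words:
        parts = decompose(w, elements, lexicon)
        if parts is not None and len(parts) >= 2:
            found[w] = parts
    return found


def generate(parts: List[str], elements, lexicon) -> List[str]:
    """Generation: translate each part and concatenate every combination."""
    options = [[elements[p]] if p in elements else lexicon[p] for p in parts]
    return ["".join(combo) for combo in product(*options)]


def select(candidates: List[str], target_vocab: set) -> List[str]:
    """Selection: keep only candidates attested in the target corpus."""
    return [c for c in candidates if c in target_vocab]


def align(source_words, target_vocab, elements, lexicon) -> Dict[str, List[str]]:
    """Run the three steps and return source compounds with attested translations."""
    result = {}
    for word, parts in extract(source_words, elements, lexicon).items():
        kept = select(generate(parts, elements, lexicon), target_vocab)
        if kept:
            result[word] = kept
    return result


def pivot_elements(src_to_pivot: Dict[str, str],
                   pivot_to_tgt: Dict[str, str]) -> Dict[str, str]:
    """Pivot variant: compose two element tables through a shared pivot language."""
    return {s: pivot_to_tgt[p] for s, p in src_to_pivot.items() if p in pivot_to_tgt}


if __name__ == "__main__":
    source = ["hydrology", "biofuel", "wind", "bio"]
    target = {"hydrologie", "biocarburant", "éolien"}
    print(align(source, target, ELEMENTS_EN_FR, DICT_EN_FR))
```

In this sketch, selection is a plain membership test against the target corpus vocabulary, and pivoting only composes two element tables. The paper's own criteria for both steps may differ.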