Mining parenthetical translations for Polish-English lexica

  • Authors: Filip Graliński
  • Affiliations: Faculty of Mathematics and Computer Science, Adam Mickiewicz University, Poznań, Poland
  • Venue: CICLing'10: Proceedings of the 11th International Conference on Computational Linguistics and Intelligent Text Processing
  • Year: 2010

Abstract

Documents written in languages other than English sometimes include parenthetical English translations, usually for technical and scientific terminology. Techniques have been developed for extracting such translations (as well as transliterations) from large Chinese text corpora. This paper presents methods for mining parenthetical translations in Polish texts. The main difference between translation mining in Chinese and in Polish is that Polish is written in the Latin alphabet, which makes it more difficult to identify English translations in Polish texts. On the other hand, some parenthetically translated terms are preceded by the abbreviation "ang." (= English), a kind of "anchor" that makes it possible to query a Web search engine for such translations.
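
To illustrate the "ang." anchor idea, here is a minimal Python sketch that finds parentheses opening with "ang." and pairs the English term with the few words preceding the parenthesis. It is not the method from the paper, whose details the abstract does not give. The names `ANCHOR_RE`, `extract_anchored_translations`, and `max_words` are hypothetical, and the fixed-size word window is a crude assumption: finding the true left boundary of the Polish term is a problem this sketch does not attempt to solve.

```python
import re

# Assumed pattern: "(" + the anchor "ang." + the English term + ")".
# Nested parentheses are deliberately not handled.
ANCHOR_RE = re.compile(r"\(\s*ang\.\s*([^()]+?)\s*\)")


def extract_anchored_translations(text, max_words=3):
    """Return (polish_candidate, english_term) pairs.

    polish_candidate is simply the last `max_words` whitespace-separated
    tokens before the parenthesis, so it may contain words that do not
    belong to the term (an assumption made for illustration only).
    """
    pairs = []
    for match in ANCHOR_RE.finditer(text):
        english = match.group(1).strip()
        preceding_tokens = text[:match.start()].split()
        polish_candidate = " ".join(preceding_tokens[-max_words:])
        pairs.append((polish_candidate, english))
    return pairs


if __name__ == "__main__":
    sample = (
        "Ważnym pojęciem jest uczenie maszynowe (ang. machine learning) "
        "oraz sieć neuronowa (ang. neural network)."
    )
    for polish, english in extract_anchored_translations(sample):
        print(f"{polish!r} -> {english!r}")
```

On the sample sentence the candidates come out as "jest uczenie maszynowe" and "oraz sieć neuronowa", each one word too long, which shows why the boundary of the source term needs a separate selection step. Parenthetical translations without the anchor are not matched at all by this pattern.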