Acquiring bilingual lexica from keyword listings

Authors:
Filip Graliński;Krzysztof Jassem;Roman Kurc
Affiliations:
Adam Mickiewicz University, Faculty of Mathematics and Computer Science, Poznań, Poland;Adam Mickiewicz University, Faculty of Mathematics and Computer Science, Poznań, Poland;Wrocław University of Technology, Faculty of Computer Science and Management, Wrocław, Poland
Venue:
LTC'09 Proceedings of the 4th conference on Human language technology: challenges for computer science and linguistics
Year:
2009

Abstract

In this paper, we present a new method for acquiring bilingual dictionaries from on-line text corpora. The method merges rule-based techniques for obtaining dictionaries from structured data, such as paper dictionaries (in electronic form) or on-line glossaries, with methods used by alignment tools, such as GIZA. The basic idea is to search for anchor words, such as "abstract" or "keywords", followed by their equivalents in another language. Text fragments that follow anchor words are likely to supply new entries for bilingual lexica.
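
The anchor-word idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the anchor patterns, the function names `keyword_list` and `candidate_pairs`, and the sample text are all hypothetical. It also assumes that keyword lists in both languages appear in the same order and have equal length, so items can be paired by position. The abstract indicates that the actual method instead combines rule-based extraction with alignment techniques.

```python
import re

# Hypothetical anchor patterns; the abstract names only "abstract" and
# "keywords" as examples of anchor words. The Polish anchor is an assumption.
ANCHORS = {
    "en": r"keywords",
    "pl": r"słowa kluczowe",
}


def keyword_list(text, anchor):
    """Return the comma- or semicolon-separated items that follow an anchor word."""
    match = re.search(anchor + r"\s*:\s*(.+)", text, flags=re.IGNORECASE)
    if match is None:
        return []
    # Trim whitespace and trailing full stops, then drop empty items.
    cleaned = [item.strip(" \t.").lower() for item in re.split(r"[;,]", match.group(1))]
    return [item for item in cleaned if item]


def candidate_pairs(text, source="pl", target="en"):
    """Pair keywords by position; a simplifying assumption, not the paper's method."""
    src = keyword_list(text, ANCHORS[source])
    tgt = keyword_list(text, ANCHORS[target])
    if len(src) != len(tgt):
        # Lists of unequal length cannot be paired positionally, so skip them.
        return []
    return list(zip(src, tgt))


# Invented sample text, for illustration only.
sample = (
    "Słowa kluczowe: tłumaczenie maszynowe, słownik dwujęzyczny\n"
    "Keywords: machine translation, bilingual dictionary\n"
)
pairs = candidate_pairs(sample)
```

Positional pairing fails whenever the two lists differ in order or length, which is one reason a statistical aligner would be useful for filtering or re-pairing the candidates.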