French-English terminology extraction from comparable corpora

Authors:
Béatrice Daille;Emmanuel Morin
Affiliations:
LINA – FRE CNRS 2729, University of Nantes, Nantes Cedex 3, France;LINA – FRE CNRS 2729, University of Nantes, Nantes Cedex 3, France
Venue:
IJCNLP'05: Proceedings of the Second International Joint Conference on Natural Language Processing
Year:
2005


Abstract

This article presents a method for extracting bilingual lexica composed of single-word terms (SWTs) and multi-word terms (MWTs) from comparable corpora in a technical domain. The method first extracts MWTs in each language, then uses statistical methods to align single words and MWTs by exploiting the contexts in which the terms occur. After explaining the difficulties involved in aligning MWTs and specifying our approach, we describe the process adopted for bilingual terminology extraction and the resources used in our experiments. Finally, we evaluate our approach and demonstrate its significance, particularly for the alignment of non-compositional MWTs.
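
The abstract does not spell out how term contexts are used for alignment. As an illustration only, the following Python sketch shows one common way to do context-based alignment on comparable corpora: build a co-occurrence vector for each term, project the source vector into the target language through a seed dictionary, and rank target candidates by cosine similarity. Everything here is an assumption made for the example rather than the authors' implementation: the function names, the toy sentences, the seed dictionary, the window size, the use of raw counts, and the convention that an MWT has already been joined into a single token (for instance `blood_sugar`) by an earlier extraction step.

```python
from collections import Counter
from math import sqrt


def context_vector(term, sentences, window=2):
    """Count the words that occur within `window` tokens of `term`."""
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok == term:
                lo, hi = max(0, i - window), i + window + 1
                counts.update(t for t in tokens[lo:hi] if t != term)
    return counts


def translate_vector(vec, seed_dict):
    """Project a source-language vector into the target language.

    Context words missing from the (hypothetical) seed dictionary are dropped.
    """
    projected = Counter()
    for word, weight in vec.items():
        for translation in seed_dict.get(word, []):
            projected[translation] += weight
    return projected


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_candidates(src_term, src_sents, tgt_terms, tgt_sents, seed_dict, window=2):
    """Rank target-language terms by context similarity to `src_term`."""
    projected = translate_vector(context_vector(src_term, src_sents, window), seed_dict)
    scored = [
        (t, cosine(projected, context_vector(t, tgt_sents, window)))
        for t in tgt_terms
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy comparable corpora (invented for illustration, already tokenized).
fr_sents = [["taux", "glycémie", "patient", "augmente"]]
en_sents = [
    ["patient", "blood_sugar", "level", "rises"],
    ["patient", "insulin", "dose", "falls"],
]
seed = {"taux": ["level"], "patient": ["patient"], "augmente": ["rises"]}

ranking = rank_candidates("glycémie", fr_sents, ["insulin", "blood_sugar"], en_sents, seed)
for candidate, score in ranking:
    print(candidate, round(score, 3))
```

On this toy input the multi-word candidate `blood_sugar` ranks above `insulin` because it shares more translated context words with the French term. Since the candidate is compared as a whole unit through its contexts rather than translated word by word, the same mechanism can in principle handle non-compositional MWTs. A realistic system would typically replace the raw counts with an association measure and use a far larger corpus and dictionary.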