Leveraging reusability: cost-effective lexical acquisition for large-scale ontology translation

Authors:
G. Craig Murray;Bonnie J. Dorr;Jimmy Lin;Jan Hajič;Pavel Pecina
Affiliations:
University of Maryland;University of Maryland;University of Maryland;Charles University;Charles University
Venue:
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Year:
2006

Citing 10
Cited 4

Supporting access to large digital oral history archives

Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries
A systematic comparison of various statistical alignment models

Computational Linguistics
Enhancing cross-language information retrieval by an automatic acquisition of bilingual terminology from comparable corpora

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
The mathematics of statistical machine translation: parameter estimation

Computational Linguistics - Special issue on using large corpora: II
Extracting word correspondences from bilingual corpora based on word co-occurrences information

COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
Extraction of lexical translations from non-aligned corpora

COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2
Automatic identification of word translations from unrelated English and German corpora

ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
BLEU: a method for automatic evaluation of machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Automatic acquisition of bilingual rules for extraction of bilingual word pairs from parallel corpora

DeepLA '05 Proceedings of the ACL-SIGLEX Workshop on Deep Lexical Acquisition
Automatic processing of multilingual medical terminology: applications to thesaurus enrichment and cross-language information retrieval

Artificial Intelligence in Medicine

First experiments searching spontaneous Czech speech

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Information retrieval test collection for searching spontaneous Czech speech

TSD'07 Proceedings of the 10th international conference on Text, speech and dialogue
Multilingual ontologies for cross-language information extraction and semantic search

ER'11 Proceedings of the 30th international conference on Conceptual modeling
Overview of the CLEF-2006 cross-language speech retrieval track

CLEF'06 Proceedings of the 7th international conference on Cross-Language Evaluation Forum: evaluation of multilingual and multi-modal information retrieval

Abstract

Thesauri and ontologies provide important value in facilitating access to digital archives by representing underlying principles of organization. Translation of such resources into multiple languages is an important component for providing multilingual access. However, the specificity of vocabulary terms in most ontologies precludes fully-automated machine translation using general-domain lexical resources. In this paper, we present an efficient process for leveraging human translations when constructing domain-specific lexical resources. We evaluate the effectiveness of this process by producing a probabilistic phrase dictionary and translating a thesaurus of 56,000 concepts used to catalogue a large archive of oral histories. Our experiments demonstrate a cost-effective technique for accurate machine translation of large ontologies.