An approach for extracting bilingual terminology from Wikipedia

Authors:
Maike Erdmann;Kotaro Nakayama;Takahiro Hara;Shojiro Nishio
Affiliations:
Dept. of Multimedia Engineering, Graduate School of Information Science and Technology, Osaka University, Suita, Osaka, Japan;Dept. of Multimedia Engineering, Graduate School of Information Science and Technology, Osaka University, Suita, Osaka, Japan;Dept. of Multimedia Engineering, Graduate School of Information Science and Technology, Osaka University, Suita, Osaka, Japan;Dept. of Multimedia Engineering, Graduate School of Information Science and Technology, Osaka University, Suita, Osaka, Japan
Venue:
DASFAA'08 Proceedings of the 13th international conference on Database systems for advanced applications
Year:
2008

Abstract

With growing demand for bilingual dictionaries that cover domain-specific terminology, automatic dictionary extraction has become an active research area. However, dictionaries built from bilingual text corpora often lack sufficient accuracy and coverage for domain-specific terms. We therefore present an approach that extracts bilingual dictionaries from the link structure of Wikipedia, a large-scale encyclopedia containing a vast number of links between articles in different languages. Our methods analyze not only these interlanguage links but also extract further translation candidates from redirect pages and link texts. In an experiment, we demonstrated the advantages of our methods over a traditional approach that extracts bilingual terminology from parallel corpora.
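
The idea of combining interlanguage links with redirect pages and link texts can be illustrated with a minimal sketch. The abstract does not say how candidates are combined or scored, so the code below makes a simple assumption: every term pointing to a source article is paired with every term pointing to the linked target article. The function names (`expand_terms`, `translation_candidates`), the dictionary-based data layout, and the toy English-German data are all hypothetical and are not taken from the paper.

```python
from itertools import product


def expand_terms(title, redirects, link_texts):
    """Return the article title plus every redirect title and link text that points to it."""
    terms = {title}
    terms.update(redirects.get(title, ()))
    terms.update(link_texts.get(title, ()))
    return terms


def translation_candidates(interlanguage, src_redirects, src_link_texts,
                           tgt_redirects, tgt_link_texts):
    """Build a set of (source term, target term) translation candidates.

    Assumption: candidates are the cross product of the expanded term sets
    on both sides of each interlanguage link. No scoring or filtering is done.
    """
    candidates = set()
    for src_title, tgt_title in interlanguage.items():
        src_terms = expand_terms(src_title, src_redirects, src_link_texts)
        tgt_terms = expand_terms(tgt_title, tgt_redirects, tgt_link_texts)
        candidates.update(product(src_terms, tgt_terms))
    return candidates


# Toy data, invented for illustration only.
interlanguage = {"United States": "Vereinigte Staaten"}
src_redirects = {"United States": ["USA"]}
src_link_texts = {"United States": ["America"]}
tgt_redirects = {"Vereinigte Staaten": ["USA"]}
tgt_link_texts = {}

pairs = translation_candidates(interlanguage, src_redirects, src_link_texts,
                               tgt_redirects, tgt_link_texts)
for src, tgt in sorted(pairs):
    print(f"{src} -> {tgt}")
```

An unfiltered cross product like this will also produce noisy pairs, so a practical system would presumably rank or filter the candidates; that step is left out here because the abstract gives no details on it.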