An approach for extracting bilingual terminology from Wikipedia

  • Authors:
  • Maike Erdmann;Kotaro Nakayama;Takahiro Hara;Shojiro Nishio

  • Affiliations:
  • Dept. of Multimedia Engineering, Graduate School of Information Science and Technology, Osaka University, Suita, Osaka, Japan (all four authors)

  • Venue:
  • DASFAA'08 Proceedings of the 13th international conference on Database systems for advanced applications
  • Year:
  • 2008

Abstract

With the growing demand for bilingual dictionaries that cover domain-specific terminology, research on automatic dictionary extraction has become popular. However, the accuracy and coverage of dictionaries built from bilingual text corpora are often insufficient for domain-specific terms. We therefore present an approach for extracting bilingual dictionaries from the link structure of Wikipedia, a large-scale encyclopedia that contains a vast number of links between articles in different languages. Our methods analyze not only these interlanguage links but also extract additional translation candidates from redirect pages and link texts. In an experiment, we demonstrated the advantages of our methods over a traditional approach of extracting bilingual terminology from parallel corpora.
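
The candidate-generation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `translation_candidates`, the dictionary-based inputs, and the toy data are all assumptions made for illustration, and any scoring or filtering of candidates that the paper may apply is left out. The sketch only shows how an interlanguage link between two articles could be expanded with alternative names (redirect page titles and link texts) on both sides.

```python
from collections import defaultdict


def translation_candidates(interlanguage, src_aliases, tgt_aliases):
    """Return {source term: set of target-language candidate terms}.

    interlanguage: {source article title: target article title}
    src_aliases / tgt_aliases: {article title: iterable of alternative names},
        e.g. redirect page titles and link texts pointing to that article.
    All three structures are hypothetical stand-ins for data that would
    have to be extracted from a Wikipedia dump.
    """
    candidates = defaultdict(set)
    for src_title, tgt_title in interlanguage.items():
        # The article title plus every alias collected for that article.
        src_terms = {src_title, *src_aliases.get(src_title, ())}
        tgt_terms = {tgt_title, *tgt_aliases.get(tgt_title, ())}
        # Every source-side term is paired with every target-side term.
        for term in src_terms:
            candidates[term].update(tgt_terms)
    return dict(candidates)


# Toy data, invented for illustration only.
interlanguage = {"United States": "Vereinigte Staaten"}
src_aliases = {"United States": ["USA", "United States of America"]}
tgt_aliases = {"Vereinigte Staaten": ["USA", "Vereinigte Staaten von Amerika"]}

result = translation_candidates(interlanguage, src_aliases, tgt_aliases)
```

With only the interlanguage link, the toy data would yield a single translation pair; adding the aliases gives three source terms, each mapped to three target candidates. A real system would presumably need to rank or filter such candidates, since not every alias is a valid translation.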