Crawling English-Japanese person-name transliterations from the web

  • Authors:
  • Satoshi Sato

  • Affiliations:
  • Nagoya University, Nagoya, Japan

  • Venue:
  • Proceedings of the 18th international conference on World wide web
  • Year:
  • 2009

Quantified Score

Hi-index 0.01

Visualization

Abstract

Automatic compilation of lexicon is a dream of lexicon compilers as well as lexicon users. This paper proposes a system that crawls English-Japanese person-name transliterations from the Web, which works a back-end collector for automatic compilation of bilingual person-name lexicon. Our crawler collected 561K transliterations in five months. From them, an English-Japanese person-name lexicon with 406K entries has been compiled by an automatic post processing. This lexicon is much larger than other similar resources including English-Japanese lexicon of HeiNER obtained from Wikipedia.