Active learning for constructing transliteration lexicons from the Web

  • Authors: Jin-Shea Kuo; Haizhou Li; Ying-Kuei Yang

  • Affiliations: Dept. of Electrical Engineering, National Taiwan University of Science and Technology, Taiwan; Institute for Infocomm Research, Singapore; Dept. of Electrical Engineering, National Taiwan University of Science and Technology, Taiwan

  • Venue: Journal of the American Society for Information Science and Technology
  • Year: 2008

Abstract

This article presents an adaptive learning framework for Phonetic Similarity Modeling (PSM) that supports the automatic construction of transliteration lexicons. The learning algorithm starts with minimum prior knowledge about machine transliteration and acquires knowledge iteratively from the Web. We study the unsupervised learning and the active learning strategies that minimize human supervision in terms of data labeling. The learning process refines the PSM and constructs a transliteration lexicon at the same time. We evaluate the proposed PSM and its learning algorithm through a series of systematic experiments, which show that the proposed framework is reliably effective on two independent databases. © 2008 Wiley Periodicals, Inc.