Learning Abbreviations from Chinese and English Terms by Modeling Non-Local Information

  • Authors:
  • Xu Sun;Naoaki Okazaki;Jun’ichi Tsujii;Houfeng Wang

  • Affiliations:
  • Peking University;Tohoku University;Microsoft Research Asia;Peking University

  • Venue:
  • ACM Transactions on Asian Language Information Processing (TALIP)
  • Year:
  • 2013

Abstract

This article describes a robust approach for abbreviating terms. To incorporate non-local information into abbreviation generation, we present both an implicit and an explicit solution: a latent variable model and label encoding with global information. Although the two approaches compete with one another, we find that they are also highly complementary. We propose a combination of the two and show that it outperforms all existing methods on abbreviation generation datasets. To reduce the computational complexity of learning non-local information, we further present an online training method that can reach the objective optimum with accelerated training speed. We used a Chinese newswire dataset and an English biomedical dataset for experiments. The experiments showed that the proposed abbreviation generator with non-local information achieved the best results for both Chinese and English.