Statistical model for Japanese abbreviations

  • Authors:
  • Norifumi Murayama; Manabu Okumura

  • Affiliations:
  • (Corresponding e-mail: murayam@blogwatcher.co.jp) Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Tokyo, Japan; Precision and Intelligence Laboratory, Tokyo Institute of Technology, Tokyo, Japan

  • Venue:
  • Intelligent Data Analysis - Artificial Intelligence
  • Year:
  • 2010

Abstract

We present a new approach to detecting abbreviations of a given root expression. The method is based on a statistical model that combines two internal models: a generation model and a verification model. The statistical model accounts for both the validity of an abbreviation as a character sequence generated from its root (learned from a collection of abbreviation-root pairs) and its social validity, that is, how the abbreviation is actually used in practice (estimated from a web search engine). Experimental results show that our method outperforms traditional template-based methods. In particular, using co-occurrence in the verification model yielded the best performance for our method.
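
The abstract does not give the model's formulas, so the following is only a minimal sketch of how a generation score and a co-occurrence-based verification score might be combined to rank candidates. Everything in it is an assumption made for illustration: the function names `verification_score` and `rank_candidates`, the choice to multiply the two scores, the definition of the verification score as the fraction of an abbreviation's hits that also mention the root, and all probabilities and hit counts, which are toy values rather than results from the paper.

```python
def verification_score(cooccur_hits, abbr_hits):
    """Assumed verification score: share of pages containing the
    abbreviation that also contain the root (0.0 if there are no hits)."""
    if abbr_hits == 0:
        return 0.0
    return cooccur_hits / abbr_hits


def rank_candidates(gen_probs, hits):
    """Rank candidates by (generation probability) x (verification score).

    gen_probs: candidate -> hypothetical generation-model probability
    hits:      candidate -> (pages with root AND candidate, pages with candidate)
    The product is an illustrative choice, not the paper's stated formula.
    """
    scored = {}
    for abbr, p_gen in gen_probs.items():
        cooccur, total = hits.get(abbr, (0, 0))
        scored[abbr] = p_gen * verification_score(cooccur, total)
    # Highest combined score first
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


# Toy inputs for the root "パーソナルコンピュータ" (personal computer).
# All numbers below are made up for illustration.
gen_probs = {"パソコン": 0.30, "パーコン": 0.40, "パソピュ": 0.10}
hits = {"パソコン": (900, 1000), "パーコン": (5, 500), "パソピュ": (0, 0)}

ranking = rank_candidates(gen_probs, hits)
for abbr, score in ranking:
    print(abbr, round(score, 4))
```

With these toy numbers, the candidate that the generation score alone prefers ("パーコン") is demoted because it rarely co-occurs with the root, which is the kind of correction the social-validity component is described as providing.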