Chinese named entity identification using class-based language model

  • Authors:
  • Jian Sun; Jianfeng Gao; Lei Zhang; Ming Zhou; Changning Huang

  • Affiliations:
  • Beijing University of Posts & Telecommunications, China; Microsoft Research Asia; Tsinghua University, China; Microsoft Research Asia; Microsoft Research Asia

  • Venue:
  • COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
  • Year:
  • 2002

Abstract

We consider the problem of Chinese named entity (NE) identification using a statistical language model (LM). In this research, word segmentation and NE identification are integrated into a unified framework that consists of several class-based language models. We also adopt a hierarchical structure for one of the LMs so that nested entities in organization names can be identified. Evaluation on a large test set shows consistent improvements. Our experiments further demonstrate improvements from integrating linguistic heuristic information, a cache-based model, and NE abbreviation identification.
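
To make the idea of joint segmentation and NE identification concrete, the following is a minimal sketch of how a class-based language model can score candidate analyses of a character string. It is not the authors' implementation. It assumes a class bigram as the context model, a simple lexicon lookup as each class's entity model, and a Viterbi search over (position, class) states. The class names `PER` and `WORD`, the tables `CONTEXT` and `ENTITY`, the function `decode`, and all probabilities are hypothetical and invented for illustration. The hierarchical organization model, heuristics, cache, and abbreviation handling mentioned in the abstract are left out.

```python
import math

# Hypothetical toy parameters, invented for illustration only.
# Context model: P(class | previous class), a class bigram.
CONTEXT = {
    ("<s>", "PER"): 0.3, ("<s>", "WORD"): 0.7,
    ("PER", "WORD"): 0.9, ("PER", "PER"): 0.1,
    ("WORD", "WORD"): 0.6, ("WORD", "PER"): 0.4,
}
# Entity model: P(character string | class), here a plain lookup table.
ENTITY = {
    "PER": {"张三": 0.2, "李四": 0.2},
    "WORD": {"张": 0.01, "三": 0.02, "去": 0.05,
             "北京": 0.05, "北": 0.01, "京": 0.01},
}
MAX_LEN = 4  # longest candidate chunk, in characters (an assumption)


def decode(sentence):
    """Return the most probable list of (chunk, class) pairs.

    Segmentation and class assignment are chosen together by a
    Viterbi search over states (end position, class of last chunk).
    """
    # best[state] = (log probability, backpointer)
    best = {(0, "<s>"): (0.0, None)}
    n = len(sentence)
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_LEN), end):
            chunk = sentence[start:end]
            for cls, lexicon in ENTITY.items():
                p_chunk = lexicon.get(chunk)
                if p_chunk is None:
                    continue  # this class cannot generate the chunk
                for (pos, prev), (score, _) in list(best.items()):
                    if pos != start:
                        continue
                    p_ctx = CONTEXT.get((prev, cls))
                    if p_ctx is None:
                        continue
                    cand = score + math.log(p_ctx) + math.log(p_chunk)
                    key = (end, cls)
                    if key not in best or cand > best[key][0]:
                        best[key] = (cand, (start, prev, chunk))
    finals = [(value[0], key) for key, value in best.items() if key[0] == n]
    if not finals:
        return []  # no analysis covers the whole sentence
    _, state = max(finals)
    result = []
    while best[state][1] is not None:
        start, prev, chunk = best[state][1]
        result.append((chunk, state[1]))
        state = (start, prev)
    return result[::-1]


if __name__ == "__main__":
    print(decode("张三去北京"))
```

With these toy numbers, the analysis that treats the first two characters as one person name outscores the analysis that splits them into two ordinary words, so the word boundary and the entity label come from a single search rather than from two separate passes.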