Chinese named entity recognition with a hybrid-statistical model

  • Authors:
  • Xiaoyan Zhang;Ting Wang;Jintao Tang;Huiping Zhou;Huowang Chen

  • Affiliations:
  • National Laboratory for Parallel and Distributed Processing, Changsha, Hunan, P.R. China (all five authors)

  • Venue:
  • APWeb'05 Proceedings of the 7th Asia-Pacific web conference on Web Technologies Research and Development
  • Year:
  • 2005

Abstract

With the rapid growth of information available on the Internet, it has become more difficult to find relevant information quickly on the Web. Named Entity Recognition (NER), a key technique in web information processing tasks such as information retrieval and information extraction, has received increasing attention. In this paper we address the problem of Chinese NER using a hybrid-statistical model. The study concentrates on entity names (personal names, location names and organization names), temporal expressions (dates and times) and number expressions. The method has three characteristics: first, NER and Part-of-Speech tagging are integrated into a unified framework; second, it combines a Hidden Markov Model (HMM) with a Maximum Entropy Model (MEM) by invoking the MEM as a sub-model inside the Viterbi algorithm; third, the Part-of-Speech information of the context is used in the MEM. Experiments show that the hybrid-statistical model achieves favorable results for Chinese NER, with F1 values ranging from 74% to 92% across all kinds of named entities on open-test data.
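
To make the second point concrete, the following is a minimal sketch of how a maximum-entropy-style model can be invoked as a sub-model inside Viterbi decoding. It is an illustration only, not the paper's implementation: the abstract does not give the tag set, features, or the exact way the two scores are combined, so the names `TAGS`, `WEIGHTS`, `features`, `maxent_log_prob` and `viterbi`, the hand-set weights, and the transition probabilities are all assumptions. The toy features use only the current and previous word, whereas the paper also uses contextual Part-of-Speech information.

```python
import math

# Hypothetical tag set; the paper uses joint NER/POS tags that the abstract does not list.
TAGS = ["O", "PER", "LOC"]

# Hypothetical hand-set weights for (feature, tag) pairs; a trained MEM would learn these.
WEIGHTS = {
    ("word=张三", "PER"): 2.0,
    ("word=去", "O"): 2.0,
    ("word=北京", "LOC"): 2.0,
    ("prev=去", "LOC"): 1.0,
}

# Toy HMM parameters (assumed): uniform start, mild preference for staying in the same tag.
LOG_START = {t: math.log(1.0 / len(TAGS)) for t in TAGS}
LOG_TRANS = {p: {t: math.log(0.5 if p == t else 0.25) for t in TAGS} for p in TAGS}


def features(tokens, i):
    """Context features for position i: current word and previous word."""
    prev = tokens[i - 1] if i > 0 else "<s>"
    return [f"word={tokens[i]}", f"prev={prev}"]


def maxent_log_prob(tokens, i, tag):
    """Log-linear (maximum-entropy style) log P(tag | context at position i)."""
    feats = features(tokens, i)
    scores = {t: sum(WEIGHTS.get((f, t), 0.0) for f in feats) for t in TAGS}
    log_z = math.log(sum(math.exp(s) for s in scores.values()))
    return scores[tag] - log_z


def viterbi(tokens, tags, log_start, log_trans, log_local):
    """Best tag path; log_local is the sub-model called for each position and tag."""
    # best[i][t] holds the best log score of any path ending at position i with tag t.
    best = [{t: log_start[t] + log_local(tokens, 0, t) for t in tags}]
    back = []
    for i in range(1, len(tokens)):
        scores, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[-1][p] + log_trans[p][t])
            scores[t] = best[-1][prev] + log_trans[prev][t] + log_local(tokens, i, t)
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)
    # Follow back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    sentence = ["张三", "去", "北京"]  # "Zhang San goes to Beijing"
    print(viterbi(sentence, TAGS, LOG_START, LOG_TRANS, maxent_log_prob))
```

Using the sub-model's conditional probability in place of an HMM emission probability is a simplification chosen here for brevity; the design point it illustrates is that the dynamic program only needs a per-position, per-tag score, so a feature-rich classifier can supply it.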