Empirical study on the performance stability of named entity recognition model across domains

  • Authors:
  • Hong Lei Guo; Li Zhang; Zhong Su

  • Affiliations:
  • IBM China Research Laboratory, Haidian District, Beijing, P.R.C.; IBM China Research Laboratory, Haidian District, Beijing, P.R.C.; IBM China Research Laboratory, Haidian District, Beijing, P.R.C.

  • Venue:
  • EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
  • Year:
  • 2006

Abstract

When a machine-learning-based named entity recognition system is applied to a new domain, its performance usually degrades. In this paper, we present an empirical study of how training data size and domain information affect the performance stability of named entity recognition models. We also present an informative sample selection method for building high-quality, stable named entity recognition models across domains. Experimental results show that the performance of the named entity recognition model improves significantly after it is trained with these informative samples.