Detection of entity mentions occurring in English and Chinese text

  • Authors: Kadri Hacioglu; Benjamin Douglas; Ying Chen

  • Affiliations: University of Colorado at Boulder; University of Colorado at Boulder; University of Colorado at Boulder

  • Venue: HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
  • Year: 2005

Abstract

In this paper, we describe an integrated approach to entity mention detection that yields a monolithic, almost language-independent system. It is optimal in the sense that all categorical constraints are considered simultaneously. The system is compact and easy to develop and maintain, since only a single set of features and classifiers needs to be designed and optimized. It is implemented using one-versus-all support vector machine (SVM) classifiers and a number of feature extractors at several linguistic levels. SVMs are well known for their ability to handle a large set of overlapping features with theoretically sound generalization properties. Data sparsity might be an important issue because of the large number of classes and the relatively moderate size of the training data. However, our results show that the integrated system performs as well as a pipelined system that decomposes the problem into a few smaller sub-tasks. We conduct all our experiments on ACE 2004 data, evaluate the systems using ACE metrics, and report competitive performance.
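
To make the integrated, one-versus-all idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a hypothetical joint label inventory (entity type fused with mention level, e.g. `PER-NAM`), a toy four-token training set, and plain hinge-loss subgradient updates as a simplified stand-in for SVM training (no regularization, no kernel). The names `TRAIN`, `train_binary`, `train_one_vs_all`, `predict`, and `split_label` are illustrative only.

```python
# Toy data: each token is a set of features with ONE joint label, so a single
# classification step decides all attributes at once (assumed label scheme).
TRAIN = [
    ({"w=john", "cap"}, "PER-NAM"),
    ({"w=he", "lower"}, "PER-PRO"),
    ({"w=ibm", "cap"}, "ORG-NAM"),
    ({"w=said", "lower"}, "O"),
]


def score(weights, feats):
    """Linear score: sum of the weights of the active binary features."""
    return sum(weights.get(f, 0.0) for f in feats)


def train_binary(examples, positive, epochs=10, lr=1.0):
    """Train one 'positive class versus all others' linear classifier
    with hinge-loss subgradient updates (unregularized simplification)."""
    weights = {}
    for _ in range(epochs):
        for feats, label in examples:
            y = 1.0 if label == positive else -1.0
            if y * score(weights, feats) < 1.0:  # margin violated
                for f in feats:
                    weights[f] = weights.get(f, 0.0) + lr * y
    return weights


def train_one_vs_all(examples):
    """One binary classifier per joint label."""
    labels = sorted({label for _, label in examples})
    return {label: train_binary(examples, label) for label in labels}


def predict(models, feats):
    """Pick the joint label whose classifier gives the highest score."""
    return max(models, key=lambda label: score(models[label], feats))


def split_label(joint):
    """Recover (entity type, mention level) from a joint label."""
    parts = joint.split("-")
    return (parts[0], parts[1] if len(parts) > 1 else None)


if __name__ == "__main__":
    models = train_one_vs_all(TRAIN)
    for feats, gold in TRAIN:
        print(sorted(feats), gold, split_label(predict(models, feats)))
```

Because every combination of attributes becomes its own class, the number of classifiers grows with the product of the attribute inventories, which is where the data sparsity concern mentioned in the abstract comes from. A pipelined alternative would instead train a separate, smaller classifier for each attribute.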