Multiobjective optimization for classifier ensemble and feature selection: an application to named entity recognition

  • Authors:
  • Asif Ekbal; Sriparna Saha

  • Affiliations:
  • Indian Institute of Technology, Department of Computer Science and Engineering, Patna, India; Indian Institute of Technology, Department of Computer Science and Engineering, Patna, India

  • Venue:
  • International Journal on Document Analysis and Recognition
  • Year:
  • 2012

Abstract

In this paper, we pose the task of finding an appropriate classifier ensemble for named entity recognition as a multiobjective optimization (MOO) problem. Our underlying assumption is that, instead of searching for the best-fitting feature set for a particular classifier, ensembling several classifiers that are trained using different feature representations could be a more fruitful approach; it is crucial, however, to determine the subset of classifiers that is most suitable for the ensemble. We use three heterogeneous classifiers, namely maximum entropy, conditional random field, and support vector machine, to build a number of models based on various representations of the available features. The proposed MOO-based ensemble technique is evaluated for three resource-constrained languages, namely Bengali, Hindi, and Telugu. Evaluation yields recall, precision, and F-measure values of 92.21, 92.72, and 92.46%, respectively, for Bengali; 97.07, 89.63, and 93.20%, respectively, for Hindi; and 80.79, 93.18, and 86.54%, respectively, for Telugu. We also evaluate the proposed technique on the CoNLL-2003 shared task English data sets, obtaining recall, precision, and F-measure values of 89.72, 89.84, and 89.78%, respectively. Experimental results show that the classifier ensemble identified by the proposed MOO-based approach outperforms all the individual classifiers, two different conventional baseline ensembles, and the classifier ensemble identified by a single-objective approach. In one part of the paper, we also formulate the problem of feature selection for any classifier under the MOO framework and show that the proposed classifier ensemble outperforms it.
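To make the reported figures and the multiobjective idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that the F-measure is the balanced harmonic mean of recall and precision (the reported values are consistent with that), and that candidate ensembles are compared on two objectives to be maximized, recall and precision; the abstract does not name the objectives, so that choice is an assumption. The `candidates` scores are hypothetical toy values, not results from the paper, and all function names are illustrative.

```python
from typing import Dict, List, Tuple

Scores = Tuple[float, float]  # (recall, precision), both to be maximized


def f_measure(recall: float, precision: float) -> float:
    """Balanced F-measure: harmonic mean of recall and precision."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b on every objective
    and strictly better on at least one."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Scores]) -> List[Scores]:
    """Keep the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]


# (recall, precision, F-measure) as reported in the abstract.
REPORTED: Dict[str, Tuple[float, float, float]] = {
    "Bengali": (92.21, 92.72, 92.46),
    "Hindi": (97.07, 89.63, 93.20),
    "Telugu": (80.79, 93.18, 86.54),
    "English (CoNLL-2003)": (89.72, 89.84, 89.78),
}

# Hypothetical (recall, precision) scores of four candidate ensembles.
candidates: List[Scores] = [(90.0, 85.0), (88.0, 91.0), (87.0, 90.0), (85.0, 93.0)]
front = pareto_front(candidates)

if __name__ == "__main__":
    for language, (r, p, f) in REPORTED.items():
        print(f"{language}: computed F = {f_measure(r, p):.2f}, reported F = {f:.2f}")
    print("Non-dominated candidates:", front)
```

In this toy setting, a multiobjective search returns the whole set of non-dominated candidates rather than a single best one, whereas a single-objective approach would rank candidates by one scalar such as the F-measure. How the paper picks a final ensemble from such a set is not stated in the abstract.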