Combining multiple classifiers using vote based classifier ensemble technique for named entity recognition

  • Authors:
  • Sriparna Saha;Asif Ekbal

  • Affiliations:
  • -;-

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we pose the classifier ensemble problem under single and multiobjective optimization frameworks, and evaluate it for Named Entity Recognition (NER), an important step in almost all Natural Language Processing (NLP) application areas. We propose the solutions to two different versions of the ensemble problem for each of the optimization frameworks. We hypothesize that the reliability of predictions of each classifier differs among the various output classes. Thus, in an ensemble system it is necessary to find out either the eligible classes for which a classifier is most suitable to vote (i.e., binary vote based ensemble) or to quantify the amount of voting for each class in a particular classifier (i.e., real vote based ensemble). We use seven diverse classifiers, namely Naive Bayes, Decision Tree (DT), Memory Based Learner (MBL), Hidden Markov Model (HMM), Maximum Entropy (ME), Conditional Random Field (CRF) and Support Vector Machine (SVM) to build a number of models depending upon the various representations of the available features that are identified and selected mostly without using any domain knowledge and/or language specific resources. The proposed technique is evaluated for three resource-constrained languages, namely Bengali, Hindi and Telugu. Results using multiobjective optimization (MOO) based technique yield the overall recall, precision and F-measure values of 94.21%, 94.72% and 94.74%, respectively for Bengali, 99.07%, 90.63% and 94.66%, respectively for Hindi and 82.79%, 95.18% and 88.55%, respectively for Telugu. Results for all the languages show that the proposed MOO based classifier ensemble with real voting attains the performance level which is superior to all the individual classifiers, three baseline ensembles and the corresponding single objective based ensemble.