Improving the Performance of a NER System by Post-processing, Context Patterns and Voting

  • Authors:
  • Asif Ekbal;Sivaji Bandyopadhyay

  • Affiliations:
  • Department of Computer Science and Engineering, Jadavpur University, Kolkata, India 700032;Department of Computer Science and Engineering, Jadavpur University, Kolkata, India 700032

  • Venue:
  • ICCPOL '09 Proceedings of the 22nd International Conference on Computer Processing of Oriental Languages. Language Technology for the Knowledge-based Economy
  • Year:
  • 2009

Abstract

This paper reports the development of a Named Entity Recognition (NER) system for Bengali that combines the outputs of two classifiers, a Conditional Random Field (CRF) and a Support Vector Machine (SVM). Lexical context patterns, generated in an unsupervised way from an unlabeled corpus of 10 million wordforms, are used as classifier features to improve performance. The models are then post-processed, by considering the second-best tag of the CRF and by applying a class-splitting technique to the SVM, to improve performance further. Finally, the classifiers are combined into a single system using three weighted voting techniques. Experimental results show the effectiveness of the proposed approach, with overall average recall, precision, and F-score values of 91.33%, 88.19%, and 89.73%, respectively.
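
The abstract does not say how the three weighted voting techniques are defined. As a rough illustration of the general idea only, the sketch below combines per-token tag sequences from two classifiers by summing a per-tag weight for each proposed tag and keeping the highest-scoring tag. The function name `weighted_vote`, the tag set, and all weight values are hypothetical and are not taken from the paper. Per-tag weights are used here because, with only two voters and a single global weight each, the higher-weighted classifier would win every disagreement.

```python
from collections import defaultdict


def weighted_vote(predictions, class_weights):
    """Combine per-token tag sequences by weighted voting.

    predictions:   {classifier_name: [tag for each token]}
    class_weights: {classifier_name: {tag: weight}}
                   (hypothetical weights, e.g. a per-tag score
                   measured on held-out data; tags without an
                   entry get weight 0.0)
    """
    names = list(predictions)
    length = len(predictions[names[0]])
    if any(len(predictions[name]) != length for name in names):
        raise ValueError("all classifiers must tag the same number of tokens")

    combined = []
    for i in range(length):
        scores = defaultdict(float)
        for name in names:
            tag = predictions[name][i]
            scores[tag] += class_weights[name].get(tag, 0.0)
        # On an exact tie, max() keeps the tag that was proposed first,
        # i.e. by the classifier listed first in `predictions`.
        combined.append(max(scores, key=scores.get))
    return combined


# Illustrative data only: tags and weights are made up for this sketch.
predictions = {
    "crf": ["B-PER", "I-PER", "O", "B-LOC"],
    "svm": ["B-PER", "O", "O", "B-ORG"],
}
class_weights = {
    "crf": {"B-PER": 0.90, "I-PER": 0.90, "B-LOC": 0.70, "B-ORG": 0.60, "O": 0.95},
    "svm": {"B-PER": 0.85, "I-PER": 0.80, "B-LOC": 0.60, "B-ORG": 0.80, "O": 0.85},
}
combined = weighted_vote(predictions, class_weights)
```

In this made-up example the two classifiers disagree on the second and fourth tokens. The CRF tag wins the second token because its `I-PER` weight exceeds the SVM's `O` weight, and the SVM tag wins the fourth because its `B-ORG` weight exceeds the CRF's `B-LOC` weight.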