Unsupervised Feature Selection with Feature Clustering
WI-IAT '12 Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 01
Appropriate feature selection is crucial in any machine learning framework, especially in Maximum Entropy (ME). In this paper, the selection of appropriate features for constructing an ME-based Named Entity Recognition (NER) system is posed as a multiobjective optimization (MOO) problem. Two classification quality measures, recall and precision, are simultaneously optimized using the search capability of a popular evolutionary MOO technique, NSGA-II. The proposed technique is evaluated to determine suitable feature combinations for NER in two languages with significantly different characteristics, Bengali and English. Evaluation yields recall, precision and F-measure values of 70.76%, 81.88% and 75.91%, respectively, for Bengali, and 78.38%, 81.27% and 79.80%, respectively, for English. Comparison with an existing ME-based NER system shows that the proposed feature selection technique is more efficient than heuristic-based feature selection.
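The relationship between the two objectives and the reported F-measure can be illustrated with a minimal sketch. It assumes the F-measure is the balanced harmonic mean of recall and precision, which the abstract does not state but which is consistent with the reported figures. The `dominates` and `pareto_front` helpers show only the Pareto-dominance relation that multiobjective methods such as NSGA-II rely on. They are not the paper's implementation, and the candidate points in the usage lines are hypothetical.

```python
def f_measure(recall, precision):
    """Balanced F-measure (harmonic mean); assumed, not stated in the abstract."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


def dominates(a, b):
    """True if solution a is at least as good as b on every objective
    and strictly better on at least one (both objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the (recall, precision) points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Reported (recall, precision) pairs from the abstract, in percent.
bengali_f = round(f_measure(70.76, 81.88), 2)
english_f = round(f_measure(78.38, 81.27), 2)

# Hypothetical candidate feature subsets scored as (recall, precision).
candidates = [(0.70, 0.80), (0.60, 0.90), (0.60, 0.70)]
front = pareto_front(candidates)
```

Under this assumption the computed values agree with the reported 75.91% and 79.80%. In the hypothetical candidate list, the third point is dominated by the first, so only the first two remain on the front. A multiobjective search returns such a set of trade-off solutions rather than a single optimum.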