Discriminative feature analysis and selection for document classification

Authors:
Punya Murthy Chinta; M. Narasimha Murty
Affiliations:
Computer Science and Automation, Indian Institute of Science, Bangalore, India; Computer Science and Automation, Indian Institute of Science, Bangalore, India
Venue:
ICONIP'12: Proceedings of the 19th International Conference on Neural Information Processing - Volume Part I
Year:
2012

Abstract

Classifying a large document collection means working in a very high-dimensional feature space in which each distinct word is a feature. In such a setting, classification is costly in both running time and computing resources. Moreover, using every feature does not guarantee optimal results, because the classifier is likely to overfit. Feature selection is therefore necessary. This work analyses feature selection methods, explores the relations among them, and attempts to find a minimal subset of features that is discriminative for document classification.