Feature Selection via Set Cover

Authors:
M. Dash
Affiliations:
-
Venue:
KDEX '97 Proceedings of the 1997 IEEE Knowledge and Data Engineering Exchange Workshop
Year:
1997

Citing 0
Cited 10

Consistency Based Feature Selection

PAKDD '00 Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Current Issues and New Applications
Feature Selection Using Consistency Measure

DS '99 Proceedings of the Second International Conference on Discovery Science
Consistency-based search in feature selection

Artificial Intelligence
Toward Integrating Feature Selection Algorithms for Classification and Clustering

IEEE Transactions on Knowledge and Data Engineering
Mixed feature selection based on granulation and approximation

Knowledge-Based Systems
Consistency measures for feature selection

Journal of Intelligent Information Systems
Set Cover Feature Selection for Text Categorisation and spam detection

International Journal of Advanced Intelligence Paradigms
Optimizing reservoir features in oil exploration management based on fusion of soft computing

Applied Soft Computing
Attribute selection based on a new conditional entropy for incomplete decision systems

Knowledge-Based Systems
Feature selection filter for classification of power system operating states

Computers & Mathematics with Applications

Abstract

In pattern classification, features are used to define classes. Feature selection is a preprocessing step that searches for an "optimal" subset of features. Class separability is normally used as the basic feature selection criterion. Instead of maximizing class separability, as is common in the literature, this work adopts a criterion that aims to maintain the discriminating power of the data describing its classes. In other words, the problem is formalized as that of finding the smallest set of features that is "consistent" in describing the classes. We describe a multivariate measure of feature consistency. The new feature selection algorithm is based on Johnson's algorithm for Set Cover. Johnson's analysis implies that this algorithm runs in polynomial time and outputs a consistent feature set whose size is within a log factor of the best possible. Our experiments show that its performance in practice is much better than this bound, and that it outperforms earlier methods while using a similar amount of time.
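
The abstract does not spell out the reduction, so the following is only a minimal sketch of one common way to cast consistent feature selection as Set Cover, not necessarily the paper's exact formulation. It assumes discrete feature values, treats every pair of instances with different class labels as an element to be covered, and says a feature covers a pair when the two instances differ on it. The greedy step (pick the feature covering the most uncovered pairs) is the standard greedy Set Cover heuristic analysed by Johnson. The function name `greedy_consistent_features` and the toy data are hypothetical.

```python
from itertools import combinations


def greedy_consistent_features(rows, labels):
    """Greedy Set Cover sketch for picking a consistent feature subset.

    rows:   list of equal-length tuples of discrete feature values.
    labels: list of class labels, one per row.
    Returns a sorted list of selected feature indices.
    """
    n_features = len(rows[0]) if rows else 0
    if n_features == 0:
        return []

    # Elements to cover: pairs of rows that belong to different classes.
    uncovered = {
        (i, j)
        for i, j in combinations(range(len(rows)), 2)
        if labels[i] != labels[j]
    }

    # Feature f "covers" a pair when the two rows differ on f.
    covers = [
        {(i, j) for (i, j) in uncovered if rows[i][f] != rows[j][f]}
        for f in range(n_features)
    ]

    selected = []
    while uncovered:
        # Greedy choice: the feature separating the most uncovered pairs.
        best = max(range(n_features), key=lambda f: len(covers[f] & uncovered))
        gain = covers[best] & uncovered
        if not gain:
            # The remaining pairs agree on every feature but differ in class,
            # so the data itself is inconsistent and no subset can fix that.
            break
        selected.append(best)
        uncovered -= gain
    return sorted(selected)


# Toy data (hypothetical): the class depends only on feature 1.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
labels = ["a", "b", "a", "b"]
print(greedy_consistent_features(rows, labels))  # [1]
```

Enumerating all pairs is quadratic in the number of instances, which is acceptable for a sketch but would need a more compact representation on large data sets.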