Combinatorial feature selection problems

Authors:
M. Charikar;V. Guruswami;R. Kumar;S. Rajagopalan;A. Sahai
Venue:
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Year:
2000

Citing 0
Cited 5

Property testing of data dimensionality
SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms

Feature selection methods for text classification
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining

Feature Selection in Taxonomies with Applications to Paleontology
DS '08 Proceedings of the 11th International Conference on Discovery Science

Multi-objective genetic algorithm evaluation in feature selection
EMO'11 Proceedings of the 6th international conference on Evolutionary multi-criterion optimization

Parent assignment is hard for the MDL, AIC, and NML costs
COLT'06 Proceedings of the 19th annual conference on Learning Theory

Abstract

Motivated by frequently recurring themes in information retrieval and related disciplines, we define a genre of problems called combinatorial feature selection problems. Given a set S of multidimensional objects, the goal is to select a subset K of relevant dimensions (or features) such that some desired property Π holds for the set S restricted to K. Depending on Π, the goal could be to either maximize or minimize the size of the subset K. Several well-studied feature selection problems can be cast in this form. We study the problems in this class derived from several natural and interesting properties Π, including variants of the classical p-center problem as well as problems akin to determining the VC-dimension of a set system. Our main contribution is a theoretical framework for studying combinatorial feature selection, providing (in most cases essentially tight) approximation algorithms and hardness results for several instances of these problems.
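
The problem statement in the abstract can be made concrete with a small sketch. The code below is an illustration only, not the paper's method: it does an exhaustive search for a minimum-size subset K of dimensions such that a property Π holds for S restricted to K. The property used here (all objects remain pairwise distinct after restriction) and the names `restrict`, `all_distinct`, and `smallest_feature_subset` are assumptions chosen for the example. The search takes time exponential in the number of dimensions, which is the kind of cost that motivates approximation algorithms and hardness results.

```python
from itertools import combinations
from typing import Callable, List, Optional, Sequence, Tuple

Row = Tuple[int, ...]


def restrict(S: Sequence[Sequence[int]], K: Sequence[int]) -> List[Row]:
    """Project every object in S onto the dimensions listed in K."""
    return [tuple(x[i] for i in K) for x in S]


def all_distinct(rows: Sequence[Row]) -> bool:
    """Illustrative property (an assumption): no two restricted objects coincide."""
    return len(set(rows)) == len(rows)


def smallest_feature_subset(
    S: Sequence[Sequence[int]],
    prop: Callable[[Sequence[Row]], bool],
) -> Optional[Tuple[int, ...]]:
    """Brute-force search for a minimum-size K such that prop holds on S restricted to K.

    Tries subsets in order of increasing size, so the first hit is minimum.
    Returns None if no subset of dimensions satisfies prop.
    """
    d = len(S[0]) if S else 0
    for size in range(d + 1):
        for K in combinations(range(d), size):
            if prop(restrict(S, K)):
                return K
    return None


# Four objects over three dimensions; the last dimension is constant
# and therefore cannot help tell the objects apart.
S = [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1)]
K = smallest_feature_subset(S, all_distinct)
print(K)  # (0, 1)
```

A maximization variant would iterate over subset sizes from largest to smallest and pass a different predicate for Π; the skeleton stays the same.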