A distance-based branch and bound feature selection algorithm

  • Authors:
  • Ari Frank; Dan Geiger; Zohar Yakhini

  • Affiliations:
  • Dept. of Computer Science and Engineering, University of California San Diego, La Jolla, CA, and Computer Science Dept., Technion, Haifa, Israel; Computer Science Dept., Technion, Haifa, Israel; Computer Science Dept., Technion, Haifa, Israel, and Agilent Laboratories

  • Venue:
  • UAI'03: Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence
  • Year:
  • 2003

Abstract

There is no known efficient method for selecting the k out of n Gaussian features that achieve the lowest Bayesian classification error. We give an example in which greedy algorithms faced with this task produce suboptimal results, which motivates a more robust approach. We present a branch and bound algorithm for finding a subset of k independent Gaussian features that minimizes the naive Bayesian classification error. Our algorithm uses additive monotonic distance measures to derive bounds on the Bayesian classification error, excluding many feature subsets from evaluation while still returning an optimal solution. We test our method on synthetic data as well as data obtained from gene expression profiling.
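
To illustrate the kind of pruning the abstract describes, here is a minimal sketch (not the authors' implementation), assuming two classes with equal priors, independent Gaussian features, per-feature Bhattacharyya distances (nonnegative, so the total distance is additive and only grows as features are added), the classical Bhattacharyya lower bound on the Bayes error for pruning, and a Monte Carlo estimate of the naive Bayes error at the leaves. All function names and data below are hypothetical.

```python
# Minimal branch-and-bound sketch (illustrative only, not the authors' implementation).
# Assumptions: two classes with equal priors, independent Gaussian features,
# per-feature Bhattacharyya distances (additive and monotone for independent features),
# the classical Bhattacharyya lower bound on the Bayes error used for pruning,
# and a Monte Carlo estimate of the naive Bayes error at the leaf nodes.
import numpy as np


def bhattacharyya_1d(m1, v1, m2, v2):
    """Bhattacharyya distance between two univariate Gaussians."""
    vs = v1 + v2
    return 0.25 * (m1 - m2) ** 2 / vs + 0.5 * np.log(vs / (2.0 * np.sqrt(v1 * v2)))


def error_lower_bound(b):
    """Bhattacharyya lower bound on the Bayes error for equal class priors."""
    rho = np.exp(-b)                      # Bhattacharyya coefficient
    return 0.5 * (1.0 - np.sqrt(max(0.0, 1.0 - rho ** 2)))


def nb_error_mc(idx, mu1, var1, mu2, var2, n=20000, seed=0):
    """Monte Carlo estimate of the naive Bayes error on the selected features."""
    rng = np.random.default_rng(seed)
    m1, s1 = mu1[idx], np.sqrt(var1[idx])
    m2, s2 = mu2[idx], np.sqrt(var2[idx])

    def loglik(x, m, s):
        return -0.5 * np.sum(((x - m) / s) ** 2 + 2.0 * np.log(s), axis=1)

    x1 = rng.normal(m1, s1, size=(n, len(idx)))
    x2 = rng.normal(m2, s2, size=(n, len(idx)))
    wrong = np.sum(loglik(x1, m2, s2) > loglik(x1, m1, s1))
    wrong += np.sum(loglik(x2, m1, s1) > loglik(x2, m2, s2))
    return wrong / (2.0 * n)


def select_k_features(k, mu1, var1, mu2, var2):
    """Return the size-k feature subset minimizing the (estimated) naive Bayes error."""
    n = len(mu1)
    b = np.array([bhattacharyya_1d(mu1[i], var1[i], mu2[i], var2[i]) for i in range(n)])
    order = np.argsort(-b)                # explore high-distance features first
    best = {"err": 1.0, "subset": None}

    def recurse(start, chosen, dist):
        if len(chosen) == k:              # leaf: evaluate the actual objective
            err = nb_error_mc(np.array(chosen), mu1, var1, mu2, var2)
            if err < best["err"]:
                best["err"], best["subset"] = err, tuple(int(f) for f in chosen)
            return
        slots = k - len(chosen)
        rest = order[start:]
        if len(rest) < slots:
            return
        # Optimistic (largest possible) total distance for any completion of `chosen`.
        dist_ub = dist + np.sum(np.sort(b[rest])[-slots:])
        # A distance upper bound yields an error lower bound; prune the branch if it
        # cannot beat the incumbent (leaf estimates make this approximate here).
        if error_lower_bound(dist_ub) >= best["err"]:
            return
        for pos in range(start, n - slots + 1):
            f = order[pos]
            recurse(pos + 1, chosen + [f], dist + b[f])

    recurse(0, [], 0.0)
    return best["subset"], best["err"]


# Hypothetical usage: pick 3 of 10 synthetic features.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    mu1, mu2 = rng.normal(size=10), rng.normal(size=10)
    var1, var2 = rng.uniform(0.5, 2.0, 10), rng.uniform(0.5, 2.0, 10)
    print(select_k_features(3, mu1, var1, mu2, var2))
```

The point mirrors the abstract: because the distance measure is additive and monotonic, an optimistic distance bound for any completion of a partial subset translates into a lower bound on its classification error, so whole branches can be discarded without evaluating their leaves while the subset returned remains the best one found over the full search space.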