Enhanced Cancer Recognition System Based on Random Forests Feature Elimination Algorithm

Authors:
Akin Ozcift
Affiliations:
Gaziantep Vocational School of Higher Education, Computer Programming Division, University of Gaziantep, Gaziantep, Turkey
Venue:
Journal of Medical Systems
Year:
2012

Abstract

Accurate classifiers are vital for designing precise computer-aided diagnosis (CADx) systems. The classification performance of machine learning algorithms is sensitive to the characteristics of the data. In this respect, determining the relevant and discriminative features is a key step in improving the performance of CADx. There are various feature extraction methods in the literature; however, no universal variable selection algorithm performs well in every data analysis scheme. Random Forests (RF), an ensemble of decision trees, has been used successfully in classification studies. This success makes the RF algorithm eligible to serve as the kernel of a wrapper feature subset evaluator. We used a best-first search RF wrapper algorithm to select the optimal features of four medical datasets: colon cancer, leukemia, breast cancer, and lung cancer. We compared the accuracies of 15 widely used classifiers trained with all features against those trained with the extracted features of each dataset. The experimental results demonstrated the efficiency of the proposed feature extraction strategy, with an increase in most of the classification accuracies of the algorithms.
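
The abstract does not give implementation details, so the following is only a minimal sketch of how a forward best-first wrapper search over feature subsets might look. All names (`best_first_select`, `loo_centroid_accuracy`, `patience`) are hypothetical. To stay dependency-free, the subset evaluator here is a leave-one-out nearest-centroid classifier, which is a stand-in and not RF; in the setup the abstract describes, `score` would instead return the accuracy of a Random Forest trained on the candidate subset. The data are synthetic, with only feature 0 carrying class information.

```python
import heapq
import random
from itertools import count


def best_first_select(n_features, score, patience=5):
    """Forward best-first search over feature subsets.

    score(subset) -> float, higher is better (e.g. an accuracy estimate).
    Stops after `patience` consecutive expansions without improvement.
    """
    tie = count()  # tie-breaker so the heap never compares frozensets
    start = frozenset()
    best_subset, best_score = start, float("-inf")
    open_heap = [(0.0, next(tie), start)]  # entries: (-score, tie, subset)
    visited = {start}
    stale = 0
    while open_heap and stale < patience:
        _, _, subset = heapq.heappop(open_heap)  # most promising subset so far
        improved = False
        for f in range(n_features):
            child = subset | {f}
            if f in subset or child in visited:
                continue
            visited.add(child)
            s = score(child)
            heapq.heappush(open_heap, (-s, next(tie), child))
            if s > best_score:
                best_subset, best_score = child, s
                improved = True
        stale = 0 if improved else stale + 1
    return sorted(best_subset), best_score


def loo_centroid_accuracy(X, y, subset):
    """Stand-in evaluator (NOT RF): leave-one-out nearest-centroid accuracy."""
    cols = sorted(subset)
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue  # hold out sample i
            acc = sums.setdefault(label, [0.0] * len(cols))
            for k, c in enumerate(cols):
                acc[k] += row[c]
            counts[label] = counts.get(label, 0) + 1

        def dist(label):
            # squared distance from held-out sample to the class centroid
            return sum((X[i][c] - sums[label][k] / counts[label]) ** 2
                       for k, c in enumerate(cols))

        correct += min(sums, key=dist) == y[i]
    return correct / len(X)


# Synthetic data (assumption): feature 0 separates the classes, 1..5 are noise.
rng = random.Random(0)
y = [0] * 20 + [1] * 20
X = [[rng.gauss(10.0 * label, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(5)]
     for label in y]

selected, best = best_first_select(6, lambda s: loo_centroid_accuracy(X, y, s))
print("selected features:", selected, "score:", best)
```

Because the evaluator is passed in as a function, swapping in a cross-validated RF accuracy from a machine learning library would not change the search code; only the `score` callable would differ.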