Feature Selection for Medical Data Mining: Comparisons of Expert Judgment and Automatic Approaches

Authors:
Tsang-Hsiang Cheng; Chih-Ping Wei; Vincent S. Tseng
Affiliations:
Southern Taiwan University of Technology, Taiwan, R.O.C.; National Tsing Hua University, Taiwan, R.O.C.; National Cheng Kung University, Taiwan, R.O.C.
Venue:
CBMS '06 Proceedings of the 19th IEEE Symposium on Computer-Based Medical Systems
Year:
2006

Citing: 0
Cited by: 7

Feature selection for medical dataset using rough set theory
CEA'09 Proceedings of the 3rd WSEAS international conference on Computer engineering and applications

An evolutionary memetic algorithm for rule extraction
Expert Systems with Applications: An International Journal

Fuzzy-rough approaches for mammographic risk analysis
Intelligent Data Analysis - Knowledge Discovery in Bioinformatics

A novel data mining mechanism considering bio-signal and environmental data with applications on asthma monitoring
Computer Methods and Programs in Biomedicine

Unsupervised feature relevance analysis applied to improve ECG heartbeat clustering
Computer Methods and Programs in Biomedicine

Computational intelligence for heart disease diagnosis: A medical knowledge driven approach
Expert Systems with Applications: An International Journal

Application of data mining techniques for detecting asymptomatic carotid artery stenosis
Computers and Electrical Engineering

Abstract

Data mining refers to the process of automatically extracting previously unknown, valid, and actionable patterns or knowledge from large databases for crucial decision support. Among data mining techniques, classification analysis is widely adopted in healthcare applications to support medical diagnostic decisions, improve the quality of patient care, etc. If a training dataset contains irrelevant features (i.e., attributes), classification analysis may produce less accurate and less understandable results. Two commonly employed feature selection approaches are automatic feature selection mechanisms (i.e., data-driven) and expert judgment (i.e., knowledge-driven). Because their underlying processes differ, the two approaches may each carry their own biases, which can lead to dissimilar classification effectiveness. In this study, we empirically evaluate the classification effectiveness resulting from the two feature selection approaches on a cardiovascular disease risk prediction dataset. Our evaluation results suggest that the feature subsets selected by domain experts improve the sensitivity of a classifier, while the feature subsets selected by an automatic feature selection mechanism improve the predictive power of a classifier on the majority class (i.e., the specificity in this study).
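
The two measures contrasted in the abstract, sensitivity (recall on the positive class) and specificity (recall on the negative class, the majority class in this study), can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function name `sensitivity_specificity`, the label encoding (1 = at risk, 0 = not at risk), and the toy predictions are all hypothetical and chosen only to show how two classifiers can trade one measure against the other.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN), recall on the positive class
    specificity = TN / (TN + FP), recall on the negative class
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against a class that is absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data (assumption): 3 positive cases, 7 negative cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
# Classifier A flags more cases: catches every positive, raises false alarms.
pred_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
# Classifier B is conservative: no false alarms, misses most positives.
pred_b = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

for name, pred in (("A", pred_a), ("B", pred_b)):
    sens, spec = sensitivity_specificity(y_true, pred)
    print(f"classifier {name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

On an imbalanced dataset, overall accuracy can hide this trade-off, which is one reason to report the two measures separately when comparing feature subsets.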