Gene selection and classification using Taguchi chaotic binary particle swarm optimization

Authors:
Li-Yeh Chuang; Cheng-San Yang; Kuo-Chuan Wu; Cheng-Hong Yang
Affiliations:
Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung 80041, Taiwan; Department of Plastic Surgery, Chia-Yi Christian Hospital, Chiayi 60002, Taiwan; Department of Computer Science and Information Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung 80708, Taiwan; Department of Network Systems, Toko University, Chiayi 61363, Taiwan and Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung 80708, Taiwan
Venue:
Expert Systems with Applications: An International Journal
Year:
2011

Abstract

The purpose of gene expression analysis is to discriminate between classes of samples and to predict the relative importance of each gene for sample classification. Microarray data on gene expression profiles have provided valuable results for a variety of problems and have contributed to advances in clinical medicine. Such data characteristically have a high dimension and a small sample size, which makes it difficult for a general classification method to achieve accurate classification. However, not every gene is relevant for distinguishing the sample class. Feature (gene) selection is therefore crucial for analyzing gene expression profiles correctly, and an effective gene extraction method is necessary for eliminating irrelevant genes and decreasing the classification error rate. In this paper, correlation-based feature selection (CFS) and Taguchi chaotic binary particle swarm optimization (TCBPSO) were combined into a hybrid method. The K-nearest neighbor (K-NN) method with leave-one-out cross-validation (LOOCV) served as the classifier for ten gene expression profiles. Experimental results show that this hybrid method effectively simplifies feature selection by reducing the number of features needed. The proposed method obtained the lowest classification error rate on all ten gene expression data sets tested, and for six of them a classification error rate of zero could be reached. It also outperformed five other methods from the literature in terms of classification error rate, and could thus constitute a valuable tool for gene expression analysis in future studies.
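
The evaluation step named in the abstract, K-NN scored by leave-one-out cross-validation on a selected gene subset, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `loocv_knn_error`, the binary `mask` encoding of a gene subset, the Euclidean distance, the default `k=1`, and the toy data are all assumptions made for illustration. In a wrapper method of this kind, such an error rate would typically serve as the fitness of a candidate subset.

```python
from math import dist  # Euclidean distance, Python 3.8+


def loocv_knn_error(samples, labels, mask, k=1):
    """Leave-one-out error rate of a k-NN classifier restricted to the
    genes whose mask bit is set. (Hypothetical helper, for illustration.)"""
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 1.0  # assumption: an empty subset counts as the worst case
    reduced = [[row[i] for i in selected] for row in samples]

    errors = 0
    for held_out, query in enumerate(reduced):
        # Rank every other sample by distance to the held-out sample.
        neighbours = sorted(
            ((dist(query, other), labels[j])
             for j, other in enumerate(reduced) if j != held_out),
            key=lambda pair: pair[0],
        )[:k]
        # Majority vote among the k nearest neighbours.
        votes = {}
        for _, label in neighbours:
            votes[label] = votes.get(label, 0) + 1
        predicted = max(votes, key=votes.get)
        if predicted != labels[held_out]:
            errors += 1
    return errors / len(samples)


# Toy data (made up): gene 0 separates the classes, gene 1 is misleading.
samples = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.1], [1.1, 1.1]]
labels = ["a", "a", "b", "b"]

print(loocv_knn_error(samples, labels, [1, 0]))  # only gene 0 selected
print(loocv_knn_error(samples, labels, [0, 1]))  # only gene 1 selected
```

On this toy data the subset containing only the informative gene classifies every held-out sample correctly, while the subset containing only the misleading gene misclassifies all of them, which is the kind of contrast a subset search is meant to exploit.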