Derivation of an artificial gene to improve classification accuracy upon gene selection

Authors:
Minseok Seo;Sejong Oh
Affiliations:
Department of Nanobiomedical Science, Dankook University, Cheonan 330-714, Republic of Korea;Department of Nanobiomedical Science, Dankook University, Cheonan 330-714, Republic of Korea
Venue:
Computational Biology and Chemistry
Year:
2012


Abstract

Classification analysis has been under continuous development since 1936. The field has advanced through new classifiers such as k-nearest neighbors (KNN), artificial neural networks (ANN), and support vector machines (SVM), as well as through progress in data preprocessing. Very high-dimensional data, such as microarray data, require feature (gene) selection before classification. The goal of feature selection is to choose a subset of informative features that reduces processing time and yields higher classification accuracy. In this study, we devised a method of artificial gene making (AGM) for microarray data to improve classification accuracy. The artificial gene is derived from the whole microarray dataset and is combined with the result of gene selection for classification analysis. We experimentally confirmed a clear improvement in classification accuracy after inserting the artificial gene. The artificial gene worked well with popular feature (gene) selection algorithms and classifiers. The proposed approach can be applied to any type of high-dimensional dataset.
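
The abstract does not state how the artificial gene is computed, so the following is only a minimal sketch of the overall pipeline under stated assumptions, not the authors' AGM procedure. It assumes a two-class dataset, uses a simple mean-difference filter as a stand-in for any gene selection algorithm, and derives the extra feature as a weighted sum over all genes, with weights equal to the difference of the class means. The function names (`select_top_k`, `fit_artificial_gene`, `augment`) and the synthetic data are hypothetical.

```python
import random
from statistics import mean, pstdev


def class_values(X, y, j, label):
    """Values of gene j for the samples that carry the given class label."""
    return [row[j] for row, lab in zip(X, y) if lab == label]


def select_top_k(X, y, k):
    """Stand-in gene selection: rank genes by scaled class-mean difference."""
    scores = []
    for j in range(len(X[0])):
        spread = pstdev([row[j] for row in X]) or 1.0  # guard against zero spread
        diff = mean(class_values(X, y, j, 1)) - mean(class_values(X, y, j, 0))
        scores.append(abs(diff) / spread)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def fit_artificial_gene(X, y):
    """ASSUMPTION: one weight per gene, the difference of the class means.

    This uses every gene in the dataset, not only the selected ones.
    In a real experiment the weights should be fitted on training samples only.
    """
    return [
        mean(class_values(X, y, j, 1)) - mean(class_values(X, y, j, 0))
        for j in range(len(X[0]))
    ]


def augment(X, selected, weights):
    """Keep the selected genes and append the artificial gene as a last column."""
    return [
        [row[j] for j in selected] + [sum(v * w for v, w in zip(row, weights))]
        for row in X
    ]


# Synthetic two-class data: 20 samples, 50 genes, the first 3 weakly informative.
random.seed(0)
n_genes = 50
y = [0] * 10 + [1] * 10
X = [
    [random.gauss(0.5 * label if j < 3 else 0.0, 1.0) for j in range(n_genes)]
    for label in y
]

selected = select_top_k(X, y, k=5)
weights = fit_artificial_gene(X, y)
X_aug = augment(X, selected, weights)  # 5 selected genes + 1 artificial gene
```

The augmented matrix `X_aug` can then be passed to any classifier (KNN, SVM, and so on) in place of the selected genes alone. Comparing cross-validated accuracy with and without the last column would be the natural way to check whether the extra feature helps on a given dataset.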