A novel divide-and-merge classification for high dimensional datasets

Authors:
Minseok Seo;Sejong Oh
Affiliations:
Department of Nanobiomedical Science and WCU Research Center of Nanobiomedical Science, Dankook University, Anseodong, Cheonan 330-714, South Korea;Department of Nanobiomedical Science and WCU Research Center of Nanobiomedical Science, Dankook University, Anseodong, Cheonan 330-714, South Korea
Venue:
Computational Biology and Chemistry
Year:
2013

Abstract

High-dimensional datasets can contain thousands of features, which can make classification computationally expensive. These datasets therefore need a feature selection step before classification. The main idea behind feature selection is to choose a useful subset of features that improves the comprehensibility of a classifier and maximizes the performance of a classification algorithm. In this paper, we propose a one-per-class model for high-dimensional datasets. In the proposed method, we extract a different feature subset for each class in a dataset and run the classification process on each of these feature subsets. We then merge the prediction results obtained from the feature subsets to determine the final class label of an unknown instance. The originality of the proposed model lies in using an appropriate feature subset for each class. To show the usefulness of the proposed approach, we have developed an application method that follows the proposed model. Our results confirm that the method produces higher classification accuracy than previously proposed feature selection and classification methods.
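The divide-and-merge idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the feature selector, the per-class classifier, or the merge rule. The sketch assumes a simple one-vs-rest mean-separation score for ranking features, a nearest-centroid decision on each class's own subset, and a merge by largest normalized margin. All function names and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def split_by_class(X, y, target):
    """Split rows into the target class and the rest (one-vs-rest view)."""
    pos = [row for row, label in zip(X, y) if label == target]
    neg = [row for row, label in zip(X, y) if label != target]
    return pos, neg


def select_features(X, y, target, k):
    """Divide step: keep the k features that best separate `target` from the rest.

    Assumed score: |mean difference| / (sum of standard deviations).
    """
    pos, neg = split_by_class(X, y, target)
    scores = []
    for j in range(len(X[0])):
        p = [row[j] for row in pos]
        n = [row[j] for row in neg]
        spread = pstdev(p) + pstdev(n) + 1e-9  # avoid division by zero
        scores.append((abs(mean(p) - mean(n)) / spread, j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def centroid(rows, feats):
    """Mean vector of `rows`, restricted to the selected feature indices."""
    return [mean([row[j] for row in rows]) for j in feats]


def sq_dist(row, feats, center):
    """Squared Euclidean distance computed on the selected features only."""
    return sum((row[j] - c) ** 2 for j, c in zip(feats, center))


def fit(X, y, k):
    """Build one model per class, each with its own feature subset."""
    models = {}
    for target in sorted(set(y)):
        feats = select_features(X, y, target, k)
        pos, neg = split_by_class(X, y, target)
        models[target] = (feats, centroid(pos, feats), centroid(neg, feats))
    return models


def predict(models, row):
    """Merge step: pick the class whose own model gives the largest margin."""
    margins = {}
    for target, (feats, c_pos, c_neg) in models.items():
        d_pos = sq_dist(row, feats, c_pos)
        d_neg = sq_dist(row, feats, c_neg)
        # Normalized to [-1, 1] so margins from different subsets are comparable.
        margins[target] = (d_neg - d_pos) / (d_neg + d_pos + 1e-9)
    return max(margins, key=margins.get)


# Toy data (hypothetical): feature 0 marks class "a", 1 marks "b", 2 marks "c",
# and feature 3 is noise.
X = [
    [5.0, 0.1, 0.2, 1.0],
    [5.2, 0.0, 0.1, 3.0],
    [0.1, 4.9, 0.0, 2.0],
    [0.0, 5.1, 0.2, 1.5],
    [0.2, 0.1, 5.0, 2.5],
    [0.1, 0.0, 5.3, 1.2],
]
y = ["a", "a", "b", "b", "c", "c"]

models = fit(X, y, k=1)
print({label: feats for label, (feats, _, _) in models.items()})
print(predict(models, [4.8, 0.2, 0.1, 2.0]))
```

On this toy data each class ends up with a different single-feature subset, which is the point of the one-per-class model. In practice the scoring function, the base classifier, and the merge rule would each be replaced by whatever the application method specifies.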