A novel divide-and-merge classification for high dimensional datasets

  • Authors:
  • Minseok Seo;Sejong Oh

  • Affiliations:
  • Department of Nanobiomedical Science and WCU Research Center of Nanobiomedical Science, Dankook University, Anseodong, Cheonan 330-714, South Korea;Department of Nanobiomedical Science and WCU Research Center of Nanobiomedical Science, Dankook University, Anseodong, Cheonan 330-714, South Korea

  • Venue:
  • Computational Biology and Chemistry
  • Year:
  • 2013

Abstract

High-dimensional datasets can contain thousands of features, which makes classification computationally expensive. Such datasets therefore need a feature selection step before classification. The main idea behind feature selection is to choose a useful subset of features that improves the comprehensibility of a classifier and maximizes the performance of the classification algorithm. In this paper, we propose a one-per-class model for high-dimensional datasets. In the proposed method, we extract a separate feature subset for each class in the dataset and run the classification process on each of these subsets. Finally, we merge the prediction results obtained from the feature subsets to determine the final class label of an unknown instance. The originality of the proposed model lies in using a feature subset suited to each class. To show the usefulness of the approach, we developed an application method that follows the proposed model. Our results confirm that the method achieves higher classification accuracy than earlier feature selection and classification methods.
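
The divide-and-merge idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the feature scoring rule, the base classifier, or the merge rule, so everything concrete below is an assumption made for illustration: a one-vs-rest mean-separation score for ranking features, a nearest-centroid classifier per class, and a merge step that picks the class with the largest margin. The function names (`select_features_for_class`, `fit`, `predict`) and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def select_features_for_class(X, y, target, k):
    """Divide step: rank features by how well they separate `target`
    from all other classes, and keep the indices of the top k.
    The separation score used here is an illustrative assumption."""
    pos = [row for row, label in zip(X, y) if label == target]
    neg = [row for row, label in zip(X, y) if label != target]
    scores = []
    for j in range(len(X[0])):
        p = [row[j] for row in pos]
        n = [row[j] for row in neg]
        spread = pstdev(p) + pstdev(n) + 1e-9  # avoid division by zero
        scores.append((abs(mean(p) - mean(n)) / spread, j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def centroid(rows, features):
    """Mean vector of `rows`, restricted to the given feature indices."""
    return [mean(row[j] for row in rows) for j in features]


def sq_dist(row, features, center):
    """Squared Euclidean distance from `row` to `center` on `features`."""
    return sum((row[j] - c) ** 2 for j, c in zip(features, center))


def fit(X, y, k=1):
    """Build one binary (class vs. rest) model per class, each on its
    own feature subset."""
    model = {}
    for target in sorted(set(y)):
        feats = select_features_for_class(X, y, target, k)
        pos = [row for row, label in zip(X, y) if label == target]
        neg = [row for row, label in zip(X, y) if label != target]
        model[target] = (feats, centroid(pos, feats), centroid(neg, feats))
    return model


def predict(model, row):
    """Merge step (assumed rule): each per-class model reports a margin,
    and the class with the largest margin becomes the final label."""
    margins = {
        target: sq_dist(row, feats, neg_c) - sq_dist(row, feats, pos_c)
        for target, (feats, pos_c, neg_c) in model.items()
    }
    return max(margins, key=margins.get)


# Toy data: each class is marked by a different feature; feature 3 is noise.
X = [
    [5.0, 1.0, 1.0, 0.3],
    [5.2, 0.9, 1.1, 0.7],
    [1.0, 5.0, 1.0, 0.5],
    [1.1, 5.1, 0.9, 0.2],
    [1.0, 1.0, 5.0, 0.6],
    [0.9, 1.1, 5.2, 0.4],
]
y = ["a", "a", "b", "b", "c", "c"]

model = fit(X, y, k=1)
for label, (feats, _, _) in model.items():
    print(label, "uses features", feats)
print(predict(model, [4.8, 1.2, 0.8, 0.5]))
```

In this sketch each class ends up with a different feature subset, which is the property the abstract highlights. A single global feature ranking would instead force all classes to share one subset.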