Feature selection using misclassification counts

  • Authors:
  • Adil Bagirov, Andrew Yatsko, Andrew Stranieri, Herbert Jelinek

  • Affiliations:
  • University of Ballarat, Ballarat, Victoria, Australia (all authors); Charles Sturt University, Albury, New South Wales, Australia (Herbert Jelinek)

  • Venue:
  • AusDM '11 Proceedings of the Ninth Australasian Data Mining Conference - Volume 121
  • Year:
  • 2011

Abstract

Dimensionality reduction of the problem space, through detection and removal of variables that contribute little or nothing to classification, can relieve the computational load and the instance acquisition effort, since otherwise every data attribute must be accessed each time. The approach to feature selection in this paper is based on the concept of coherent accumulation of data about class centers with respect to the coordinates of informative features. Features are ranked by the degree to which they exhibit random characteristics. The results are verified using the Nearest Neighbor classifier; this also helps address feature irrelevance and redundancy, which ranking alone does not resolve. Additionally, feature ranking methods from independent sources are included for direct comparison.
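
The following is a minimal illustrative sketch, not the authors' algorithm: it ranks features with a stand-in relevance score (mutual information substitutes for the paper's randomness-based criterion) and then verifies nested feature subsets with a 1-Nearest-Neighbor classifier by counting misclassifications on held-out data. The dataset, scoring function, and subset strategy are all assumptions made for illustration.

```python
# Sketch only: rank features, then verify nested subsets with 1-NN by
# counting misclassified test instances (stands in for the paper's method).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data standing in for a real dataset.
X, y = make_classification(n_samples=400, n_features=20, n_informative=5,
                           n_redundant=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Rank features; mutual information is an assumed stand-in for the
# "degree of randomness" criterion described in the abstract.
scores = mutual_info_classif(X_train, y_train, random_state=0)
ranking = np.argsort(scores)[::-1]  # best-scoring feature first

# Verify nested subsets with a 1-NN classifier, reporting the
# misclassification count for each subset size.
for k in range(1, len(ranking) + 1):
    subset = ranking[:k]
    knn = KNeighborsClassifier(n_neighbors=1).fit(X_train[:, subset], y_train)
    errors = int((knn.predict(X_test[:, subset]) != y_test).sum())
    print(f"top {k:2d} features -> {errors} misclassifications")
```

In this sketch the subset whose misclassification count stops improving marks a candidate cut-off for discarding the remaining, presumably irrelevant or redundant, features.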