Attribute clustering and dimensionality reduction based on in/out degree of attributes in dependency graph

  • Authors:
  • Asit Kumar Das;Jaya Sil;Santanu Phadikar

  • Affiliations:
  • Department of Computer Science and Technology, Bengal Engineering and Science University, Howrah, India;Department of Computer Science and Technology, Bengal Engineering and Science University, Howrah, India;Department of Computer Science and Engineering, West Bengal University of Technology, Kolkata, India

  • Venue:
  • SEMCCO'11 Proceedings of the Second international conference on Swarm, Evolutionary, and Memetic Computing - Volume Part I
  • Year:
  • 2011

Abstract

Mining useful information from huge datasets requires appropriate tools and techniques to organize and evaluate the data. However, the ultra-high dimensionality of such data poses serious challenges in data mining research. The method proposed in the paper introduces a new strategy for dimensionality reduction by attribute clustering based on the dependency graph of the attributes. The information gain of each attribute, an established measure of uncertainty that quantifies the information contained in the system, is calculated to express the dependency relationships between the attributes in the graph. The underlying principle selects an optimum set of attributes, called a reduct, that can classify the dataset as well as the full set of attributes does. The rate of dimension reduction on datasets from the UCI repository is measured and compared with existing methods, and the classification accuracy on the reduced datasets is computed with various classifiers to measure the effectiveness of the method.
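The abstract does not give the formulas used, so the following is only a minimal sketch of the standard definition of information gain between two discrete attributes, IG(Y; X) = H(Y) - H(Y | X). The function names, the toy table, and the idea of storing pairwise gains in a dictionary are assumptions made for illustration; they are not taken from the paper, and the paper's actual graph construction and in/out-degree rule may differ.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(target, attribute):
    """IG(target; attribute) = H(target) - H(target | attribute)."""
    n = len(target)
    # Group the target values by the value the attribute takes on each row.
    groups = {}
    for t, a in zip(target, attribute):
        groups.setdefault(a, []).append(t)
    # Conditional entropy: size-weighted entropy of the target in each group.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(target) - conditional


# Hypothetical toy table: each key is an attribute, each list is a column.
table = {
    "a": [0, 0, 1, 1],
    "b": [0, 0, 1, 1],  # identical to "a", so fully determined by it
    "c": [0, 1, 0, 1],  # statistically independent of "a" in this table
}

# Pairwise gains: gain[(x, y)] is how much knowing x reduces uncertainty in y.
gain = {
    (x, y): information_gain(table[y], table[x])
    for x in table
    for y in table
    if x != y
}
```

In this toy table, knowing "a" removes all uncertainty about "b" (a gain of 1 bit) and none about "c" (a gain of 0 bits). A dependency graph could plausibly be built by adding a directed edge wherever such a gain is high, but the edge rule used in the paper is not stated in the abstract.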