Clustering and classifying informative attributes using rough set theory

  • Authors:
  • Rudra Kalyan Nayak;Debahuti Mishra;Satyabrata Das;Kailash Shaw;Sashikala Mishra;Ramamani Tripathy

  • Affiliations:
  • ITER, Siksha O Anusandhan University, Bhubaneswar, Odisha, India;ITER, Siksha O Anusandhan University, Bhubaneswar, Odisha, India;Trident Academy of Technology, Bhubaneswar, Odisha, India;Gandhi Engineering College, Bhubaneswar, Odisha, India;ITER, Siksha O Anusandhan University, Bhubaneswar, Odisha, India;ITER, Siksha O Anusandhan University, Bhubaneswar, Odisha, India

  • Venue:
  • Proceedings of the International Conference on Advances in Computing, Communications and Informatics
  • Year:
  • 2012

Abstract

Clustering techniques are unsupervised data mining methods used to explore natural structure and identify interesting patterns in data; they have also proved helpful in finding co-expressed samples. In cluster analysis, a dataset is generally partitioned into groups based on the given features such that the data objects in the same group are more similar to each other than to the data objects in other groups. Objects are grouped on the principle of maximizing intra-class similarity and minimizing inter-class similarity. In this paper, rough set theory (RST) is used for attribute clustering. RST is a theory for dealing with rough and uncertain knowledge; it analyzes the clusters and finds the underlying principles in the data when prior knowledge is not available, providing a new method for data classification. Although many rough set methods exist, data objects change continuously, so the relevant techniques must be improved over time and new theory proposed to meet the demands of applications. After applying the rough set based attribute clustering method to a real-life leukemia dataset, we classify the resulting data using traditional classification techniques: a Multilayer Perceptron (MLP) based classifier, a Naïve Bayesian (NB) classifier and a Support Vector Machine (SVM). The same classification techniques are also applied to the original leukemia dataset, before rough set based attribute clustering. Finally, the paper provides a comparative analysis of the traditional classifiers and the corresponding proposed rough set based classifiers. Among all of these, the proposed MLP classifier is found to perform better than the others, giving higher classification accuracy and a lower error ratio.
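
The abstract does not spell out the attribute clustering procedure, so the following is only a minimal sketch of the standard rough set building blocks that such a method would typically rest on: indiscernibility classes, lower and upper approximations, and the dependency degree of a decision attribute on a set of condition attributes. It is not the authors' algorithm. The toy table, the attribute names `g1`, `g2`, `g3`, the decision attribute `cls`, and the labels `c1`/`c2` are all hypothetical and are not taken from the leukemia dataset.

```python
from collections import defaultdict


def indiscernibility_classes(table, attrs):
    """Group object ids whose values agree on every attribute in attrs."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attrs)
        blocks[key].add(obj)
    return list(blocks.values())


def lower_upper(table, attrs, target):
    """Return the lower and upper approximations of target w.r.t. attrs."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(table, attrs):
        if block <= target:   # block lies entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return lower, upper


def dependency_degree(table, cond_attrs, decision_attr):
    """gamma = |POS_C(D)| / |U|: the fraction of objects whose decision
    class is determined unambiguously by the condition attributes."""
    positive = set()
    for decision_block in indiscernibility_classes(table, [decision_attr]):
        lower, _ = lower_upper(table, cond_attrs, decision_block)
        positive |= lower
    return len(positive) / len(table)


# Hypothetical toy decision table: discretised values of three made-up
# attributes for six samples, plus a made-up class label.
table = {
    "s1": {"g1": "high", "g2": "low",  "g3": "low",  "cls": "c1"},
    "s2": {"g1": "high", "g2": "low",  "g3": "high", "cls": "c1"},
    "s3": {"g1": "low",  "g2": "high", "g3": "low",  "cls": "c2"},
    "s4": {"g1": "low",  "g2": "high", "g3": "high", "cls": "c2"},
    "s5": {"g1": "low",  "g2": "low",  "g3": "low",  "cls": "c1"},
    "s6": {"g1": "low",  "g2": "low",  "g3": "low",  "cls": "c2"},
}

c1_samples = {obj for obj, row in table.items() if row["cls"] == "c1"}
lower, upper = lower_upper(table, ["g1", "g2"], c1_samples)
gamma_g1_g2 = dependency_degree(table, ["g1", "g2"], "cls")
gamma_g3 = dependency_degree(table, ["g3"], "cls")
```

In this toy table, samples `s5` and `s6` are indiscernible on `g1` and `g2` but carry different labels, so they fall in the upper approximation of class `c1` without reaching its lower approximation. Attributes with a higher dependency degree carry more information about the class, which is one plausible basis for ranking or grouping attributes before passing them to a classifier.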