Clustering and classifying informative attributes using rough set theory

Authors:
Rudra Kalyan Nayak; Debahuti Mishra; Satyabrata Das; Kailash Shaw; Sashikala Mishra; Ramamani Tripathy
Affiliations:
ITER, Siksha O Anusandhan University, Bhubaneswar, Odisha, India; ITER, Siksha O Anusandhan University, Bhubaneswar, Odisha, India; Trident Academy of Technology, Bhubaneswar, Odisha, India; Gandhi Engineering College, Bhubaneswar, Odisha, India; ITER, Siksha O Anusandhan University, Bhubaneswar, Odisha, India; ITER, Siksha O Anusandhan University, Bhubaneswar, Odisha, India
Venue:
Proceedings of the International Conference on Advances in Computing, Communications and Informatics
Year:
2012

Abstract

Clustering techniques are unsupervised data mining methods that are important for exploring natural structure and identifying interesting patterns in the original data; they have also proved helpful in finding coexpressed samples. In cluster analysis, the given dataset is generally partitioned into groups based on the given features, such that data objects in the same group are more similar to each other than to data objects in other groups. Objects are grouped on the principle of maximizing intra-class similarity and minimizing inter-class similarity. In this paper, rough set theory (RST) is used for attribute clustering. RST is a theory for dealing with rough and uncertain knowledge; it analyzes the clusters and finds the underlying principles in the data when prior knowledge is not available, providing a new method for data classification. Although many rough set methods exist, data objects change continuously, so the relevant techniques must be improved over time and new theory proposed to meet the demands of applications. After applying the rough set based attribute clustering method to a real-life leukemia dataset, we classify the result using traditional classification techniques: a Multilayered Perceptron (MLP) based classifier, a Naïve Bayesian (NB) classifier, and a Support Vector Machine (SVM). The same classification techniques are also applied to the original leukemia dataset, before application of rough set based attribute clustering. Finally, the paper provides a comparative analysis of the traditional classifiers and the corresponding proposed rough set based classifiers. Among all of them, the proposed MLP classifier is found to be the best, giving higher classification accuracy, and it proves efficient, with a lower error ratio.
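
To make the rough set vocabulary in the abstract concrete, the following is a minimal sketch of the standard RST building blocks: indiscernibility classes, lower and upper approximations, and the dependency degree of a decision attribute on a set of condition attributes. It is not the authors' attribute clustering method, which the abstract does not specify. The toy table, the attribute names `a1`, `a2`, `d`, and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def indiscernibility_classes(table, attrs):
    """Group object ids whose values agree on every attribute in attrs."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())


def lower_upper(table, attrs, target):
    """Return the (lower, upper) approximations of the object set `target`."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(table, attrs):
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper


def dependency_degree(table, cond_attrs, decision_attr):
    """Fraction of objects in the positive region of the decision attribute."""
    positive = set()
    for decision_block in indiscernibility_classes(table, [decision_attr]):
        lower, _ = lower_upper(table, cond_attrs, decision_block)
        positive |= lower
    return len(positive) / len(table)


# Hypothetical toy decision table: two discretized condition attributes
# (a1, a2) and one decision attribute (d). Not taken from the paper.
TOY = {
    "s1": {"a1": "high", "a2": "low", "d": "c1"},
    "s2": {"a1": "high", "a2": "low", "d": "c2"},
    "s3": {"a1": "low", "a2": "low", "d": "c2"},
    "s4": {"a1": "low", "a2": "high", "d": "c1"},
}

if __name__ == "__main__":
    class_c1 = {"s1", "s4"}
    print(lower_upper(TOY, ["a1", "a2"], class_c1))
    print(dependency_degree(TOY, ["a1", "a2"], "d"))
```

In this toy table, `s1` and `s2` are indiscernible on `a1` and `a2` yet carry different decisions, so neither belongs to the lower approximation of its class. A dependency degree below 1 reflects exactly this kind of uncertainty, and comparing the degree across attribute subsets is one common way to judge how informative those attributes are.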