An efficient classifier design integrating rough set and set oriented database operations

Authors:
Asit Kumar Das;Jaya Sil
Affiliations:
Bengal Engineering and Science University, Computer Science and Technology, Shibpur, Howrah, West Bengal 711-103, India;Bengal Engineering and Science University, Computer Science and Technology, Shibpur, Howrah, West Bengal 711-103, India
Venue:
Applied Soft Computing
Year:
2011

Abstract

Feature subset selection and dimensionality reduction are fundamental and among the most explored areas of research in machine learning and data mining. Rough set theory (RST) provides a sound basis for data mining and can be used at different phases of the knowledge discovery process. In this paper, by integrating the concepts of RST and relational algebra operations, a new attribute reduction algorithm is presented to select the minimum sets of attributes, called reducts, required for classification of data. First, the conditional attributes are partitioned into groups according to their scores, calculated using the projection (Π) and division (÷) operations of relational algebra. The groups are sorted in ascending order of score, and the first group, which contains the maximum information, is used on its own to generate reducts. The non-reduct attributes are then combined with the elements of the next group, and the modified group is considered for computing reducts. The process continues until all groups are exhausted, yielding a final set of reducts. A decision tree algorithm is then applied to each reduct to generate decision rule sets, which are later pruned by removing extraneous components. Finally, using concepts from probability theory and graph theory, a minimum number of rules is obtained and used to build an efficient classifier.
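
The abstract names projection and division as the building blocks but does not give the scoring formula, so the following is only a minimal sketch of the two relational algebra operations on a toy decision table, plus a common rough-set-style consistency check built on projection. The function names (`project`, `divide`, `preserves_decision`), the toy table, and the consistency check itself are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Dict, Hashable, List, Sequence, Set, Tuple

Row = Dict[str, Hashable]
Tup = Tuple[Hashable, ...]


def project(rows: List[Row], attrs: Sequence[str]) -> Set[Tup]:
    """Relational projection: the distinct tuples over `attrs`."""
    return {tuple(r[a] for a in attrs) for r in rows}


def divide(rows: List[Row], keep: Sequence[str],
           by: Sequence[str], divisor: Set[Tup]) -> Set[Tup]:
    """Relational division: tuples over `keep` that appear together
    with every tuple of `divisor` (taken over the attributes `by`)."""
    seen: Dict[Tup, Set[Tup]] = {}
    for r in rows:
        key = tuple(r[a] for a in keep)
        seen.setdefault(key, set()).add(tuple(r[a] for a in by))
    return {k for k, vals in seen.items() if divisor <= vals}


def preserves_decision(rows: List[Row], attrs: Sequence[str],
                       decision: str) -> bool:
    """Assumed check (not the paper's score): `attrs` determine the
    decision if adding the decision attribute to the projection does
    not create any new distinct tuples."""
    return len(project(rows, attrs)) == len(project(rows, list(attrs) + [decision]))


# Hypothetical decision table: conditional attributes a, b, c; decision d.
rows: List[Row] = [
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "yes"},
    {"a": 1, "b": 0, "c": 0, "d": "no"},
    {"a": 1, "b": 1, "c": 0, "d": "yes"},
]

b_alone_suffices = preserves_decision(rows, ["b"], "d")
a_alone_suffices = preserves_decision(rows, ["a"], "d")
# Values of b that occur with every value of a in the table.
b_with_all_a = divide(rows, ["b"], ["a"], project(rows, ["a"]))
```

In this toy table the single attribute `b` already determines `d`, while `a` does not, which is the kind of distinction a reduct search has to make. How the paper turns projection and division into a per-attribute score is not stated in the abstract and is left open here.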