An improved CART decision tree for datasets with irrelevant feature

Authors:
Ali Mirza Mahmood;Mohammad Imran;Naganjaneyulu Satuluri;Mrithyumjaya Rao Kuppa;Vemulakonda Rajesh
Affiliations:
Acharya Nagarjuna University, Guntur, Andhra Pradesh, India;Rayalaseema University, Kurnool, Andhra Pradesh, India;Acharya Nagarjuna University, Guntur, Andhra Pradesh, India;Vaagdevi College of Engineering, Warangal, Andhra Pradesh, India;Pursuing M.Tech, MIST, Sathupalli, Khamaman District, Andhra Pradesh, India
Venue:
SEMCCO'11 Proceedings of the Second international conference on Swarm, Evolutionary, and Memetic Computing - Volume Part I
Year:
2011

Abstract

The results of data mining tasks usually improve when the dimensionality of the data is reduced. This improvement, however, is harder to achieve when the dataset is moderate or large in size. Although numerous algorithms for improving accuracy have been proposed, all of them assume that inducing a compact and highly generalized model is difficult. To address this issue, we introduce the Randomized Gini Index (RGI), a novel heuristic function for dimensionality reduction that is particularly applicable to large-scale databases. Apart from removing irrelevant attributes, our algorithm can reduce the level of noise in the data to a great extent, which is a very attractive feature for data mining problems. We extensively evaluate its performance through experiments on both artificial and real-world datasets. The results of the study show the suitability and viability of our approach for knowledge discovery in moderate and large datasets.
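
The abstract does not define RGI, so the sketch below is not the authors' method. For orientation only, it shows the standard Gini index on which CART splitting is based, and how the resulting impurity reduction separates a relevant attribute from an irrelevant one. The function names `gini_impurity` and `gini_gain` are hypothetical, and the multiway split on categorical values is a simplifying assumption (CART itself uses binary splits).

```python
from collections import Counter
from typing import Hashable, Sequence


def gini_impurity(labels: Sequence[Hashable]) -> float:
    """Standard Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(feature: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Impurity reduction obtained by splitting on a categorical attribute.

    Assumption: each distinct attribute value forms its own branch,
    which simplifies CART's binary splits.
    """
    n = len(labels)
    groups: dict = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average impurity of the branches after the split.
    weighted = sum(len(g) / n * gini_impurity(g) for g in groups.values())
    return gini_impurity(labels) - weighted


# Toy data (illustrative, not from the paper).
labels = ["yes", "yes", "no", "no"]
relevant = ["a", "a", "b", "b"]      # perfectly predicts the label
irrelevant = ["x", "y", "x", "y"]    # carries no information about the label

relevant_gain = gini_gain(relevant, labels)
irrelevant_gain = gini_gain(irrelevant, labels)
print(relevant_gain, irrelevant_gain)
```

In this toy case the relevant attribute removes all impurity while the irrelevant one removes none, which is the kind of signal an attribute-filtering heuristic built on the Gini index could use.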