Reducing the dimensionality of data usually improves the results of data mining tasks. This improvement, however, is harder to achieve when the data size is moderate or large. Although numerous algorithms for accuracy improvement have been proposed, all assume that inducing a compact and highly generalized model is difficult. To address this issue, we introduce the Randomized Gini Index (RGI), a novel heuristic function for dimensionality reduction that is particularly applicable to large-scale databases. Apart from removing irrelevant attributes, our algorithm can also reduce the level of noise in the data to a greater extent, which is an attractive feature for data mining problems. We extensively evaluate its performance through experiments on both artificial and real-world datasets. The results show the suitability and viability of our approach for knowledge discovery in moderate and large datasets.
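The abstract does not define RGI itself, so the following is only a minimal sketch of the classical Gini index on which the name builds, used here to rank categorical attributes by impurity reduction. It is not the authors' randomized variant. The function names (`gini_impurity`, `gini_gain`, `rank_attributes`) and the toy data are assumptions introduced purely for illustration.

```python
from collections import Counter


def gini_impurity(labels):
    """Classical Gini impurity: 1 - sum over classes of p_k squared."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(values, labels):
    """Impurity reduction obtained by splitting on one categorical attribute."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average impurity of the partitions induced by the attribute.
    weighted = sum(len(g) / n * gini_impurity(g) for g in groups.values())
    return gini_impurity(labels) - weighted


def rank_attributes(columns, labels):
    """Return attribute names sorted by decreasing Gini gain."""
    scores = {name: gini_gain(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: one attribute separates the classes, one does not.
labels = ["yes", "yes", "no", "no"]
columns = {
    "informative": ["a", "a", "b", "b"],
    "irrelevant": ["x", "y", "x", "y"],
}
ranking = rank_attributes(columns, labels)
```

In this sketch, attributes with a gain near zero are candidates for removal. How RGI introduces randomization and handles noise is not stated in the abstract and is not modeled here.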