Classification in imbalanced domains is a recent challenge in data mining. A classification problem is imbalanced when the data contain many examples of one class and few of the other, and the under-represented class is the one of greater interest for the learning task. One of the most widely used techniques for tackling this problem is to preprocess the data before the learning process. This preprocessing can be done through under-sampling, which removes examples, mainly from the majority class, or through over-sampling, which replicates or generates new minority examples. In this paper, we propose an under-sampling procedure guided by evolutionary algorithms that performs training set selection to enhance the decision trees obtained by the C4.5 algorithm and the rule sets obtained by the PART rule induction algorithm. The proposal has been compared with other under-sampling and over-sampling techniques; the results indicate that the new approach is very competitive with over-sampling in terms of accuracy and that it outperforms standard under-sampling. Moreover, the resulting models are smaller in terms of the number of leaves or rules generated, and they can be considered more interpretable. The results have been contrasted through non-parametric statistical tests over multiple data sets.
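To make the two preprocessing families mentioned above concrete, the following is a minimal sketch of the plain random baselines: random under-sampling, which discards majority examples, and random over-sampling by replication. It is not the evolutionary procedure proposed in the paper. The function names `random_undersample` and `random_oversample`, the `seed` parameter, and the choice to balance the classes exactly are assumptions made purely for illustration.

```python
import random
from collections import Counter


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, sized to the smallest class.

    Illustrative baseline only (hypothetical helper, not the paper's method).
    """
    rng = random.Random(seed)
    counts = Counter(y)
    minority_size = min(counts.values())
    keep = []
    for label in counts:
        idx = [i for i, lab in enumerate(y) if lab == label]
        # Sample without replacement so no example is duplicated.
        keep.extend(rng.sample(idx, minority_size))
    keep.sort()  # preserve the original ordering of the retained examples
    return [X[i] for i in keep], [y[i] for i in keep]


def random_oversample(X, y, seed=0):
    """Replicate random examples of smaller classes up to the largest class size.

    Illustrative baseline only (hypothetical helper, not the paper's method).
    """
    rng = random.Random(seed)
    counts = Counter(y)
    majority_size = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        idx = [i for i, lab in enumerate(y) if lab == label]
        # Draw with replacement: copies of existing minority examples are added.
        for _ in range(majority_size - n):
            i = rng.choice(idx)
            X_out.append(X[i])
            y_out.append(label)
    return X_out, y_out


if __name__ == "__main__":
    # Toy imbalanced data: 8 majority examples (class 0), 2 minority (class 1).
    X = [[i] for i in range(10)]
    y = [0] * 8 + [1] * 2
    _, y_under = random_undersample(X, y)
    _, y_over = random_oversample(X, y)
    print(Counter(y_under), Counter(y_over))
```

Under-sampling shrinks the training set, which tends to yield smaller trees or rule sets but may discard informative majority examples. Over-sampling keeps all of the original data at the cost of a larger training set. A guided selection procedure, such as the evolutionary one described in the abstract, replaces the random choice of which majority examples to keep.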