Learning from imbalanced data tends to produce high error rates, and several approaches have been developed to address this problem. In this paper, a new learning model that integrates the C4.5 classifier with evolutionary algorithms is introduced. To strengthen the model, two separate partitions are derived from each original data set by applying two distinct partitioning schemes proposed in this investigation, and the learning model uses them in sequence. More specifically, the hybrid system first applies the base method, C4.5, to produce a set of rules R from a training set T_1 constructed by the first partitioning scheme. R is then passed to a Genetic Algorithm, which discovers another set of rules R_{GA} from a second, disjoint training set T_2 determined by the second partitioning scheme. Finally, the informative rules of R_{GA} are added to R. The system is tested on several real data sets from the UCI machine learning repository and compared with standard C4.5. Experimental results indicate that the system is well suited to imbalanced data sets, and that it does not degrade performance on balanced data sets either.
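The three-stage flow described above (rules R from T_1, evolved rules R_{GA} from T_2, then a merge) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the abstract does not specify the two partitioning schemes, so T_1 and T_2 are simply passed in as disjoint lists; `base_rules` is a one-threshold-per-feature stand-in and is not C4.5; the rule encoding, the precision-times-recall fitness, the genetic operators, and the `min_fitness` cutoff for "informative" rules are all hypothetical choices.

```python
import random
from typing import List, Optional, Tuple

Example = Tuple[Tuple[float, ...], int]   # (feature vector, class label)
Rule = Tuple[int, float, int]             # fires when x[feature] > threshold; predicts label


def fitness(rule: Rule, data: List[Example]) -> float:
    """Assumed fitness: precision times recall of the rule for its predicted class."""
    feat, thr, label = rule
    fired = [y for x, y in data if x[feat] > thr]
    positives = sum(1 for _, y in data if y == label)
    correct = sum(1 for y in fired if y == label)
    if not fired or positives == 0:
        return 0.0
    return (correct / len(fired)) * (correct / positives)


def base_rules(t1: List[Example]) -> List[Rule]:
    """Stand-in for C4.5 (NOT C4.5): best single-threshold rule per feature on T_1."""
    rules = []
    for feat in range(len(t1[0][0])):
        candidates = [(feat, x[feat], label) for x, _ in t1 for label in (0, 1)]
        rules.append(max(candidates, key=lambda r: fitness(r, t1)))
    return rules


def evolve_rules(seed_rules: List[Rule], t2: List[Example], minority: int,
                 generations: int = 30, pop_size: int = 20,
                 rng: Optional[random.Random] = None) -> List[Rule]:
    """Tiny genetic algorithm seeded with R, searching for minority-class rules on T_2."""
    rng = rng or random.Random(0)
    # Seed the population with R, retargeted at the minority class.
    pop = [(f, t, minority) for f, t, _ in seed_rules]
    while len(pop) < pop_size:
        f, t, _ = rng.choice(seed_rules)
        pop.append((f, t + rng.gauss(0, 1.0), minority))
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        pop.sort(key=lambda r: fitness(r, t2), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: blend thresholds when both parents test the same feature.
            thr = (a[1] + b[1]) / 2 if a[0] == b[0] else a[1]
            # Mutation: small Gaussian perturbation of the threshold.
            children.append((a[0], thr + rng.gauss(0, 0.3), minority))
        pop = parents + children
    return sorted(set(pop), key=lambda r: fitness(r, t2), reverse=True)


def merge(r: List[Rule], r_ga: List[Rule], t2: List[Example],
          min_fitness: float = 0.5) -> List[Rule]:
    """Append the 'informative' rules of R_GA (assumed: fitness >= min_fitness) to R."""
    extra = [rule for rule in r_ga
             if fitness(rule, t2) >= min_fitness and rule not in r]
    return r + extra


def hybrid(t1: List[Example], t2: List[Example], minority: int = 1) -> List[Rule]:
    """Run the three stages in sequence: base rules, GA refinement, merge."""
    r = base_rules(t1)
    r_ga = evolve_rules(r, t2, minority)
    return merge(r, r_ga, t2)
```

Calling `hybrid(t1, t2)` with two disjoint lists of `(features, label)` pairs returns R followed by any evolved rules that pass the cutoff. Keeping R intact and only appending to it mirrors the abstract's statement that rules from R_{GA} are included into R rather than replacing it.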