Effective search for genetic-based machine learning systems via estimation of distribution algorithms and embedded feature reduction techniques

Authors:
Jiadong Yang;Hua Xu;Peifa Jia
Affiliations:
Jike.com, Beijing 100020, China;State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, ...;State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, ...
Venue:
Neurocomputing
Year:
2013

Abstract

Genetic-based machine learning (GBML) systems, which employ evolutionary algorithms (EAs) as search mechanisms, evolve rule-based classification models to represent target concepts. Compared to Michigan-style GBML, Pittsburgh-style GBML is expected to achieve more compact solutions. It has been shown that standard recombination operators in EAs do not guarantee an effective evolutionary search on sophisticated problems that contain strong interactions between features. In addition, in real-world classification tasks, irrelevant features not only complicate the problem but also incur unnecessary matching operations in GBML systems, which substantially increases the computational cost. To handle these two problems in an integrated manner, a new Pittsburgh-style GBML system is proposed. In the proposed method, classifiers are generated and recombined at two levels. At the high level, classifiers are recombined by rule-wise uniform crossover operators, since each classifier consists of a variable-size rule set. At the low level, the single rules contained in classifiers are reproduced by sampling Bayesian networks that characterize the global statistical information extracted from the promising rules found so far. Furthermore, based on the statistical information in the rule population, an embedded approach is presented that detects and removes redundant features incrementally as the rule population evolves. Empirical results show that the proposed method outperforms the original Pittsburgh-style GBML system in classification accuracy while reducing the computational cost. The proposed method is also competitive with other widely used non-evolutionary machine learning methods. With respect to feature reduction, the proposed embedded approach delivers solutions with higher classification accuracy than other feature reduction techniques when the same number of features is removed.
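
The two-level recombination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The ternary rule encoding ('0', '1', '#' for "don't care"), the function names, and the swap probability are all assumptions made for illustration. In particular, the low level here samples from independent per-feature marginals, a deliberately simpler stand-in for the Bayesian networks used in the paper, so it does not capture interactions between features.

```python
import random
from typing import Dict, List, Tuple

Rule = Tuple[str, ...]      # assumed encoding: one symbol per feature, '#' = don't care
Classifier = List[Rule]     # a Pittsburgh-style individual is a variable-size rule set


def rulewise_uniform_crossover(a: Classifier, b: Classifier, rng: random.Random,
                               swap_prob: float = 0.5):
    """High level: exchange whole rules between parents; rules are never cut."""
    shared = min(len(a), len(b))
    child_a, child_b = [], []
    for i in range(shared):
        if rng.random() < swap_prob:
            child_a.append(b[i])
            child_b.append(a[i])
        else:
            child_a.append(a[i])
            child_b.append(b[i])
    # Surplus rules of the longer parent stay with the matching child.
    child_a.extend(a[shared:])
    child_b.extend(b[shared:])
    return child_a, child_b


def estimate_marginals(rules: List[Rule],
                       alphabet=('0', '1', '#')) -> List[Dict[str, float]]:
    """Low level, step 1: fit a per-feature distribution to promising rules.

    Simplification: independent marginals instead of a Bayesian network.
    """
    marginals = []
    for j in range(len(rules[0])):
        counts = {s: 1 for s in alphabet}   # Laplace smoothing
        for rule in rules:
            counts[rule[j]] += 1
        total = sum(counts.values())
        marginals.append({s: c / total for s, c in counts.items()})
    return marginals


def sample_rule(marginals: List[Dict[str, float]], rng: random.Random) -> Rule:
    """Low level, step 2: draw a new rule from the estimated model."""
    return tuple(
        rng.choices(list(m.keys()), weights=list(m.values()))[0]
        for m in marginals
    )
```

In this sketch the crossover only redistributes existing rules between the two offspring, while new rule content comes solely from sampling the estimated model, which mirrors the division of labour between the two levels.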