Genetic learner: Discretization and fuzzification of numerical attributes

  • Authors:
  • Ivan Bruha; Pavel Kralik; Petr Berka

  • Affiliations:
  • McMaster University, Department of Computing & Software, Hamilton, Ont., Canada L8S 4L7. E-mail: bruha@mcmaster.ca, URL: http://www.cas.mcmaster.ca/~bruha
  • Technical University of Brno, Department of Automation and Information Technology, Technicka 2, Brno, CZ-61669, Czech Republic. E-mail: kralik@vertigo.fme.vutbr.cz
  • Prague University of Economics, Laboratory of Intelligent Systems, Prague, CZ-13067, Czech Republic. E-mail: berka@vse.cz

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2000


Abstract

Machine learning (ML) is a useful and productive component of data mining (DM). Given a large database, a learning algorithm induces a description of the concepts (classes) embedded in a given problem area. The induction itself consists of searching a usually huge space of possible concept descriptions, and several paradigms exist for controlling this search. One promising and efficient paradigm is genetic algorithms (GAs), and many research projects have incorporated genetic algorithms into machine learning. This paper describes an efficient application of a GA in an attribute-based rule-inducing learning algorithm: a domain-independent GA has been integrated into the covering learning algorithm CN4, a large extension of the well-known algorithm CN2. The induction procedure of CN4 (its beam search methodology) has been removed and the GA implanted into this shell. Genetic algorithms can process symbolic attributes in a simple, natural manner; processing numerical (continuous) attributes, however, is not so straightforward. One feasible strategy is to discretize numerical attributes before the genetic algorithm is called, and quite a few discretization preprocessors already exist in data mining and machine learning. This paper describes a new preprocessor for the discretization (categorization) of numerical attributes. Conventional discretization procedures generate sharp bounds (thresholds) between intervals, which may capture training objects from various classes (concepts) in a single interval that is not `pure'; this happens in particular near the interval borders. One feasible way to eliminate such impurity around the interval borders is to fuzzify them. The paper first introduces the methodology of our new learning algorithm, the genetic learner. Then the discretization/fuzzification preprocessor is presented. 
Finally, the paper compares the entire system (the preprocessor and the genetic learner) with well-known covering as well as TDIDT learning algorithms.
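The idea of fuzzifying sharp interval borders can be illustrated with a minimal sketch. This is not the authors' preprocessor: the cut points and the fuzziness half-width `eps` are hypothetical parameters, and the trapezoidal membership functions are one common way to let neighbouring intervals overlap around each threshold instead of meeting at a crisp bound.

```python
# Illustrative sketch (not the paper's algorithm): discretize a numeric
# attribute at given cut points, then fuzzify each cut point so that
# membership ramps linearly over [c - eps, c + eps] instead of jumping
# from 1 to 0 at the threshold.

def make_fuzzy_intervals(cuts, eps):
    """Return one membership function per interval defined by sorted cuts."""
    bounds = [float("-inf")] + list(cuts) + [float("inf")]
    funcs = []
    for lo, hi in zip(bounds, bounds[1:]):
        def mu(x, lo=lo, hi=hi):
            # Ramp up after the lower border, ramp down before the upper one;
            # well inside the interval both factors are 1 (full membership).
            left = 1.0 if lo == float("-inf") else min(1.0, max(0.0, (x - (lo - eps)) / (2 * eps)))
            right = 1.0 if hi == float("inf") else min(1.0, max(0.0, ((hi + eps) - x) / (2 * eps)))
            return min(left, right)
        funcs.append(mu)
    return funcs

# Example: an attribute split at 10 and 20, fuzzified with eps = 1.
mus = make_fuzzy_intervals([10, 20], eps=1.0)
print([round(mu(10.5), 2) for mu in mus])  # → [0.25, 0.75, 0.0]
```

A value near a border (here 10.5) now belongs partly to both adjacent intervals, with the two memberships summing to 1, while a value well inside an interval (e.g. 15) keeps full membership in that interval alone.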