A niching genetic programming-based multi-objective algorithm for hybrid data classification

Authors:
Marconi De Arruda Pereira;Clodoveu Augusto Davis Júnior;Eduardo Gontijo Carrano;João Antônio De Vasconcelos
Venue:
Neurocomputing
Year:
2014

Abstract

This paper introduces a multi-objective algorithm based on genetic programming to extract classification rules from databases composed of hybrid data, i.e., regular (e.g. numerical, logical, and textual) and non-regular (e.g. geographical) attributes. The algorithm employs a niching technique combined with a population archive to identify the rules that are most suitable for assigning items to the classes of a given data set. It is implemented in such a way that the user can choose the function set that is most appropriate for a given application, which makes the proposed approach applicable to virtually any kind of data set classification problem. The classification problem is modeled as a multi-objective one, in which maximizing the accuracy and minimizing the classifier complexity are the objective functions. A set of classification problems with considerably different data sets and domains has been considered: wines, patients with hepatitis, incipient faults in power transformers, and level of development of cities. In the last data set, some of the attributes are geographical and are expressed as points, lines, or polygons. The effectiveness of the algorithm has been compared with three other methods widely employed for classification: Decision Tree (C4.5), Support Vector Machine (SVM), and Radial Basis Function (RBF). Statistical comparisons have been conducted using one-way ANOVA and Tukey's tests in order to provide a reliable comparison of the methods. The results show that the proposed algorithm achieved better classification effectiveness in all tested instances, which suggests that it is suitable for a considerable range of classification applications.
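
To illustrate the two-objective formulation described in the abstract (maximize accuracy, minimize classifier complexity), the following is a minimal sketch of Pareto-dominance filtering over candidate rules. It is not the authors' implementation: the `Rule` type, the use of an integer node count as the complexity measure, and the sample values are all assumptions made for illustration, and the niching and archive mechanisms are not modeled here.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    name: str        # hypothetical label for a candidate classification rule
    accuracy: float  # objective 1: to be maximized
    complexity: int  # objective 2: to be minimized (assumed: node count of the rule tree)


def dominates(a: Rule, b: Rule) -> bool:
    """Return True if `a` is no worse than `b` on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.complexity <= b.complexity
    strictly_better = a.accuracy > b.accuracy or a.complexity < b.complexity
    return no_worse and strictly_better


def pareto_front(rules: list[Rule]) -> list[Rule]:
    """Keep only the rules that no other rule dominates, preserving input order."""
    return [r for r in rules if not any(dominates(other, r) for other in rules)]


# Illustrative values only; they are not results from the paper.
candidates = [
    Rule("r1", 0.90, 15),
    Rule("r2", 0.85, 7),
    Rule("r3", 0.80, 9),   # dominated by r2 (less accurate and more complex)
    Rule("r4", 0.90, 20),  # dominated by r1 (same accuracy, more complex)
    Rule("r5", 0.70, 3),
]

front = pareto_front(candidates)
print([r.name for r in front])  # ['r1', 'r2', 'r5']
```

The surviving rules represent different trade-offs between accuracy and compactness; a multi-objective algorithm of the kind described would retain such non-dominated candidates rather than collapse the two objectives into a single score.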