Analysis of company growth data using genetic algorithms on binary trees

Authors:
Gerrit K. Janssens;Kenneth Sörensen;Arthur Limère;Koen Vanhoof
Affiliations:
Faculty of Applied Economics, Data Analysis and Modelling Research Group (DAM), Limburg University Centre, Diepenbeek, Belgium;Faculty of Applied Economics, University of Antwerp, Antwerp, Belgium;Faculty of Applied Economics, Financial Management Research Group (FIM), Limburg University Centre, Diepenbeek, Belgium;Faculty of Applied Economics, Data Analysis and Modelling Research Group (DAM), Limburg University Centre, Diepenbeek, Belgium
Venue:
PAKDD'05 Proceedings of the 9th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
Year:
2005

Abstract

This paper investigates why some companies grow faster than others by mining survey data on a large number of companies in Flanders (the northern part of Belgium). Faster or slower average growth over a time period is explained by building a classification tree on several categorical variables (both quantitative and qualitative). The technique used, called genAID, splits the population at different levels. Like the Automatic Interaction Detector (AID) technique that inspired it, genAID searches for trees that explain the variability in average growth, but it uses a genetic algorithm to overcome some of AID's drawbacks. Classical AID and other tree-growing techniques usually generate a single tree for interpretation. This approach has been criticized because artifacts in the data may produce spurious interactions. genAID instead offers the user-analyst a set of trees: the best ones found over a number of generations of the genetic algorithm. The user-analyst can then choose a tree by trading off explanatory power against either ease of understanding or conformity with an existing theory.
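
The general idea can be illustrated with a minimal Python sketch. It is not the genAID implementation: the abstract does not specify the tree encoding, the genetic operators, or the fitness measure, so everything below is an assumption made for illustration. The sketch encodes a binary tree as nested tuples, scores it by the fraction of variability in growth that its leaves explain (an AID-style criterion), and runs a small genetic algorithm that returns several of the best distinct trees rather than a single one. The variables `sector` and `size`, their levels, the synthetic data, and all parameter values are hypothetical.

```python
import random

# Hypothetical category levels for two illustrative survey variables.
LEVELS = {"sector": ["A", "B", "C"], "size": ["small", "large"]}

def make_data(rng, n=200):
    """Synthetic (features, growth) rows; not the survey data from the paper."""
    rows = []
    for _ in range(n):
        x = {var: rng.choice(levels) for var, levels in LEVELS.items()}
        signal = 2.0 * (x["sector"] == "A") + 1.0 * (x["size"] == "large")
        rows.append((x, signal + rng.gauss(0.0, 0.5)))
    return rows

def random_tree(rng, depth):
    """A tree is None (leaf) or (variable, left_levels, left_subtree, right_subtree)."""
    if depth == 0 or rng.random() < 0.3:
        return None
    var = rng.choice(sorted(LEVELS))
    k = rng.randint(1, len(LEVELS[var]) - 1)
    left = frozenset(rng.sample(LEVELS[var], k))
    return (var, left, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def leaves(tree, rows):
    """Partition rows into the groups defined by the tree's leaves."""
    if tree is None:
        return [rows]
    var, left, lt, rt = tree
    inside = [r for r in rows if r[0][var] in left]
    outside = [r for r in rows if r[0][var] not in left]
    return leaves(lt, inside) + leaves(rt, outside)

def sum_sq(values):
    """Sum of squared deviations from the mean (0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def fitness(tree, rows):
    """Fraction of the variability in growth explained by the tree's leaves.

    Assumes the growth values are not all identical (total is nonzero).
    """
    total = sum_sq([y for _, y in rows])
    within = sum(sum_sq([y for _, y in group]) for group in leaves(tree, rows))
    return 1.0 - within / total

def subtrees(tree):
    """List every subtree, leaves included, in preorder."""
    if tree is None:
        return [None]
    return [tree] + subtrees(tree[2]) + subtrees(tree[3])

def crossover(rng, a, b):
    """Replace one randomly chosen branch of a with a random subtree of b."""
    if a is None or rng.random() < 0.3:
        return rng.choice(subtrees(b))
    var, left, lt, rt = a
    if rng.random() < 0.5:
        return (var, left, crossover(rng, lt, b), rt)
    return (var, left, lt, crossover(rng, rt, b))

def prune(tree, depth):
    """Cut the tree off at the given depth so offspring stay readable."""
    if tree is None or depth == 0:
        return None
    var, left, lt, rt = tree
    return (var, left, prune(lt, depth - 1), prune(rt, depth - 1))

def evolve(rows, rng, pop_size=30, generations=15, keep=5, max_depth=3):
    """Return the `keep` best distinct trees seen in any generation."""
    pop = [random_tree(rng, max_depth) for _ in range(pop_size)]
    scored = {}  # every distinct tree evaluated so far -> fitness
    for _ in range(generations):
        scored.update((t, fitness(t, rows)) for t in pop if t not in scored)
        parents = sorted(pop, key=scored.get, reverse=True)[: pop_size // 2]
        pop = list(parents)
        while len(pop) < pop_size:
            child = crossover(rng, rng.choice(parents), rng.choice(parents))
            if rng.random() < 0.2:  # mutation: graft in a fresh random subtree
                child = crossover(rng, child, random_tree(rng, 2))
            pop.append(prune(child, max_depth))
    scored.update((t, fitness(t, rows)) for t in pop if t not in scored)
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)[:keep]

if __name__ == "__main__":
    rng = random.Random(0)
    for tree, score in evolve(make_data(rng), rng):
        print(round(score, 3), tree)
```

Returning a ranked list instead of one winner mirrors the trade-off described in the abstract: an analyst could pass over the top-scoring tree in favor of a slightly weaker one that is shallower or that splits on variables an existing theory predicts. Note that structurally different trees in this encoding can define the same partition, so a fuller version would probably deduplicate by partition rather than by tree structure.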