Analysis of company growth data using genetic algorithms on binary trees

  • Authors:
  • Gerrit K. Janssens;Kenneth Sörensen;Arthur Limère;Koen Vanhoof

  • Affiliations:
  • Faculty of Applied Economics, Data Analysis and Modelling Research Group (DAM), Limburg University Centre, Diepenbeek, Belgium;Faculty of Applied Economics, University of Antwerp, Antwerp, Belgium;Faculty of Applied Economics, Financial Management Research Group (FIM), Limburg University Centre, Diepenbeek, Belgium;Faculty of Applied Economics, Data Analysis and Modelling Research Group (DAM), Limburg University Centre, Diepenbeek, Belgium

  • Venue:
  • PAKDD'05 Proceedings of the 9th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
  • Year:
  • 2005

Abstract

This paper investigates why some companies grow faster than others by data mining a survey of a large number of companies in Flanders (the northern part of Belgium). Faster or slower average growth over a time period is explained by building a classification tree containing several categorical variables (both quantitative and qualitative). The technique used, called genAID, splits the population at different levels. Like the Automatic Interaction Detector (AID) technique that inspired it, genAID searches for trees that explain the variability in average growth, but it uses a genetic algorithm to overcome some of AID's drawbacks. Classical AID and other tree-growing techniques usually generate a single tree for interpretation. This approach has been criticized because artifacts in the data may produce spurious interactions. genAID instead offers the user-analyst a set of trees: the best ones found over a number of generations of the genetic algorithm. The user-analyst can then choose a tree by trading off explanatory power against either ease of understanding or conformity with an existing theory.
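To make the splitting idea concrete, the following is a minimal sketch of an AID-style binary split, not the genAID implementation itself. It assumes the classical AID criterion of choosing the two-way grouping of a categorical predictor that maximizes the between-group sum of squares of the response (here, growth). The function names, the `sector` variable, and the toy data are hypothetical, and the genetic-algorithm search over whole trees is not shown.

```python
from itertools import combinations


def between_group_ss(left, right):
    """Between-group sum of squares for a binary split of growth values."""
    n_l, n_r = len(left), len(right)
    if n_l == 0 or n_r == 0:
        return 0.0
    mean_l = sum(left) / n_l
    mean_r = sum(right) / n_r
    mean_all = (sum(left) + sum(right)) / (n_l + n_r)
    return n_l * (mean_l - mean_all) ** 2 + n_r * (mean_r - mean_all) ** 2


def best_binary_split(categories, growth):
    """Try every two-way grouping of the category labels.

    Returns (score, left_labels) for the grouping with the largest
    between-group sum of squares.
    """
    labels = sorted(set(categories))
    best = (0.0, frozenset())
    for k in range(1, len(labels)):
        for subset in combinations(labels, k):
            chosen = frozenset(subset)
            left = [g for c, g in zip(categories, growth) if c in chosen]
            right = [g for c, g in zip(categories, growth) if c not in chosen]
            score = between_group_ss(left, right)
            if score > best[0]:
                best = (score, chosen)
    return best


# Hypothetical toy data: one categorical predictor and average growth.
sector = ["retail", "retail", "tech", "tech", "industry", "industry"]
growth = [1.0, 2.0, 8.0, 9.0, 2.0, 1.0]

score, left_labels = best_binary_split(sector, growth)
```

On this toy data the split separates "tech" from the other two sectors, since that grouping explains the most variability in growth. A tree-growing procedure would apply such splits recursively to each resulting subgroup; under the abstract's description, genAID would instead keep several good trees for the analyst to compare rather than commit to the single greedy result.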