Discriminant Analysis and Factorial Multiple Splits in Recursive Partitioning for Data Mining

  • Authors:
  • Francesco Mola; Roberta Siciliano

  • Venue:
  • MCS '02 Proceedings of the Third International Workshop on Multiple Classifier Systems
  • Year:
  • 2002

Abstract

This paper is set in the framework of supervised statistical learning for data mining: multiple sets of inputs are used to predict an output on the basis of a training set. A typical data mining problem involves sets of inputs that are correlated within groups and large relative to the number of observed objects. Standard tree-based procedures yield unstable and hard-to-interpret solutions, especially when the relationships are complex. For this reason, multiple splits defined on a suitable combination of inputs are required. This paper provides a methodology for building a tree-based model whose node splitting is based on factorial multiple splitting variables. A recursive partitioning algorithm is introduced that uses a two-stage splitting criterion based on linear discriminant functions. The result is a fast, automated procedure that searches for factorial multiple splits able to capture suitable directions in the variability among the sets of inputs. Real-world applications are discussed, and the results of a simulation study illustrate the useful properties of the proposed methodology.
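
The abstract does not spell out the two-stage criterion, so the following is only a rough illustration of the general idea, not the authors' algorithm. It assumes a two-class output, a Fisher linear discriminant at each stage, and a single binary cut at one node (the paper's splits may have more than two branches). The function names `fisher_direction` and `two_stage_split`, the ridge constant, and the midpoint threshold are all hypothetical choices made for this sketch.

```python
import numpy as np

def fisher_direction(X, y):
    """Two-class Fisher discriminant direction w = Sw^{-1} (m1 - m0)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter; the small ridge is an assumption added
    # here to keep the solve stable when inputs are strongly correlated.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    Sw = Sw + 1e-6 * np.eye(X.shape[1])
    return np.linalg.solve(Sw, m1 - m0)

def two_stage_split(X, y, groups):
    """Sketch of a two-stage discriminant split at a single node."""
    # Stage 1: one discriminant score per group of inputs.
    within = [fisher_direction(X[:, g], y) for g in groups]
    Z = np.column_stack([X[:, g] @ w for g, w in zip(groups, within)])
    # Stage 2: a discriminant function across the group scores.
    between = fisher_direction(Z, y)
    score = Z @ between
    # Hypothetical cut: halfway between the two class mean scores.
    cut = 0.5 * (score[y == 0].mean() + score[y == 1].mean())
    return score > cut, within, between, cut

# Synthetic data, for illustration only: two groups of two inputs each.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 4))
X[y == 1] += 2.0  # shift class 1 so the classes are separable
groups = [[0, 1], [2, 3]]

right, within, between, cut = two_stage_split(X, y, groups)
# Fraction of objects sent to the child matching their class.
agreement = np.mean(right == (y == 1))
```

Applying such a split recursively to each child node would give a tree whose splitting variables are linear combinations of inputs rather than single inputs, which is the property the abstract emphasises.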