Elegant Decision Tree Algorithm for Classification in Data Mining

  • Authors:
  • B. Chandra; Sati Mazumdar; Vincent Arena; N. Parimi

  • Venue:
  • WISEW '02: Proceedings of the Third International Conference on Web Information Systems Engineering (Workshops)
  • Year:
  • 2002

Abstract

Decision trees have been found to be very effective for classification, especially in data mining. This paper aims at improving the performance of the SLIQ decision tree algorithm (Mehta et al., 1996) for classification in data mining. The drawback of this algorithm is that a large number of gini indices have to be computed at each node of the decision tree. In order to decide which attribute is to be split at each node, the gini indices have to be computed for all the attributes and for each successive pair of values for all patterns which have not been classified. An improvement over the SLIQ algorithm has been proposed to reduce the computational complexity. In this algorithm, the gini index is computed not for every successive pair of values of an attribute but over different ranges of attribute values. Classification accuracy of this technique was compared with the existing SLIQ and the neural network technique on three real-life datasets: the effect of different chemicals on water pollution, the Wisconsin Breast Cancer data, and image data. It was observed that the decision tree constructed using the proposed algorithm gave far better classification accuracy than the SLIQ algorithm, irrespective of the dataset under consideration. The classification accuracy of this algorithm was even better than that of the neural network classification technique. Overall, it was observed that this decision tree algorithm not only reduces the number of computations of gini indices but also leads to better classification accuracy.
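
To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It compares the number of candidate split points when the gini index is evaluated between every successive pair of distinct attribute values (as the abstract describes for SLIQ) against evaluating it only at the boundaries of a fixed number of ranges. The abstract does not say how the ranges are chosen, so equal-width ranges are an assumption here, and all function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini index of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_gini(values, labels, threshold):
    """Weighted gini index of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def midpoint_candidates(values):
    """One candidate between each successive pair of distinct values."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def range_candidates(values, num_ranges):
    """Candidates at the interior boundaries of equal-width ranges.

    Equal-width ranges are an assumption made for this sketch; the
    abstract only says the index is computed over ranges of values.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_ranges
    return [lo + i * width for i in range(1, num_ranges)]


def best_split(values, labels, candidates):
    """Return (gini, threshold) for the candidate with the lowest gini."""
    return min((split_gini(values, labels, t), t) for t in candidates)


values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
labels = ["a"] * 5 + ["b"] * 5

pairwise = midpoint_candidates(values)     # 9 gini evaluations
ranged = range_candidates(values, 3)       # 2 gini evaluations
print(len(pairwise), best_split(values, labels, pairwise))
print(len(ranged), best_split(values, labels, ranged))
```

With ten distinct values, the pairwise scheme evaluates nine candidates while three ranges need only two, which illustrates the reduction in gini computations. This toy example does not reproduce the accuracy results reported in the abstract; on this data the coarser range-based split is slightly less pure than the exhaustive one.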