Efficient C4.5

  • Authors: Salvatore Ruggieri
  • Venue: Efficient C4.5
  • Year: 2000

Abstract

We present an analytic evaluation of the run-time behavior of the C4.5 algorithm that highlights several possible efficiency improvements. We have implemented a more efficient version of the algorithm, called EC4.5, which improves on C4.5 by adopting the best of three strategies when constructing each node. The first strategy uses a binary search of thresholds instead of the linear search of C4.5. The second strategy adopts a counting sort method instead of the quicksort of C4.5. The third strategy uses a main-memory version of the RainForest algorithm for constructing decision trees. Our implementation computes the same decision trees as C4.5 with a performance gain of up to five times.
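
The first two strategies can be illustrated with a minimal Python sketch. It is not taken from EC4.5: the function names, the integer-valued attribute, and the reading of "threshold search" as finding the largest attribute value that does not exceed a candidate cut point are all assumptions made for illustration.

```python
from bisect import bisect_right


def largest_value_not_exceeding(sorted_values, candidate):
    """Binary search (hypothetical helper): largest element of the
    ascending list sorted_values that is <= candidate, or None."""
    i = bisect_right(sorted_values, candidate)
    return sorted_values[i - 1] if i > 0 else None


def largest_value_not_exceeding_linear(values, candidate):
    """Linear-scan baseline for comparison; values need not be sorted."""
    best = None
    for v in values:
        if v <= candidate and (best is None or v > best):
            best = v
    return best


def counting_sort(values, low, high):
    """Sort integers assumed to lie in [low, high] in O(n + range) time,
    without the comparisons a quicksort would perform."""
    counts = [0] * (high - low + 1)
    for v in values:
        counts[v - low] += 1
    out = []
    for offset, c in enumerate(counts):
        out.extend([offset + low] * c)
    return out
```

The binary search needs the attribute values sorted once up front and then answers each query in logarithmic time, while the linear scan touches every value on every query. Counting sort pays off only when the range of values is small relative to the number of cases, which would be one reason to choose among strategies per node rather than fix a single one.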