Approximated measures in construction of decision trees from large databases

  • Authors:
  • Hung Son Nguyen; Sinh Hoa Nguyen

  • Affiliations:
  • Institute of Mathematics, Warsaw University, Banacha 2, Warsaw 02-097, Poland; Polish-Japanese Institute of Information Technology, Koszykowa 86, 02-008, Warszawa, Poland

  • Venue:
  • Design and application of hybrid intelligent systems
  • Year:
  • 2003

Abstract

We present an efficient method for constructing decision trees from large data sets that are assumed to be stored on a database server and accessible through SQL queries. The method minimizes the total time of data transmission between client and server. Based on a divide-and-conquer search strategy, it minimizes the number of simple queries needed to find the best cuts. To make this possible, we develop approximate measures, defined on intervals of attribute values, that evaluate the chance that the best cut belongs to a given interval. We propose applications of this approach to discretization and to the construction of soft decision trees, a novel classifier model.
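The divide-and-conquer idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not define the approximate measures, so the interval score used here (the sum of the cut qualities at the two interval endpoints) is a hypothetical stand-in, and the cut quality is assumed to be a discernibility-style count of object pairs from different classes separated by the cut. `CountServer`, `discernibility`, and `find_cut` are illustrative names; `CountServer` is an in-memory substitute for a database that answers only simple class-count queries.

```python
from bisect import bisect_left
from collections import Counter


class CountServer:
    """Hypothetical stand-in for the database server: answers count queries only."""

    def __init__(self, values, labels):
        pairs = sorted(zip(values, labels))
        self.values = [v for v, _ in pairs]
        self.labels = [c for _, c in pairs]
        self.queries = 0  # number of simple queries issued so far

    def counts_below(self, cut):
        # Analogous to: SELECT class, COUNT(*) FROM t WHERE a < cut GROUP BY class
        self.queries += 1
        return Counter(self.labels[:bisect_left(self.values, cut)])


def discernibility(left, total):
    """Pairs of objects from different classes that the cut separates (assumed measure)."""
    n_left = sum(left.values())
    n_right = sum(total.values()) - n_left
    same_class = sum(left[c] * (total[c] - left[c]) for c in total)
    return n_left * n_right - same_class


def find_cut(server, cuts, k=4):
    """Narrow a sorted list of candidate cuts by repeatedly keeping one of k intervals."""
    total = server.counts_below(float("inf"))
    cache = {}

    def quality(i):
        # Each candidate cut costs at most one query, thanks to the cache.
        if i not in cache:
            cache[i] = discernibility(server.counts_below(cuts[i]), total)
        return cache[i]

    lo, hi = 0, len(cuts) - 1
    while hi - lo > k:
        bounds = [lo + j * (hi - lo) // k for j in range(k + 1)]
        # Hypothetical interval score: sum of the qualities at both endpoints.
        lo, hi = max(zip(bounds, bounds[1:]),
                     key=lambda b: quality(b[0]) + quality(b[1]))
    best = max(range(lo, hi + 1), key=quality)
    return cuts[best], quality(best)
```

Calling `find_cut(server, cuts)` returns the selected cut and its quality, and `server.queries` then reports how many count queries were issued. Because the endpoint-based score is only a heuristic, the search can miss the best cut when the quality has several peaks across the attribute range.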