On Efficient Construction of Decision Trees from Large Databases

  • Authors: Hung Son Nguyen
  • Affiliations: -
  • Venue: RSCTC '00 Revised Papers from the Second International Conference on Rough Sets and Current Trends in Computing
  • Year: 2000

Abstract

The main task in decision tree construction algorithms is to find the "best partition" of the set of objects. In this paper, we investigate the problem of optimal binary partition of a continuous attribute for large data sets stored in relational databases. The critical factor for the time complexity of algorithms solving this problem is the number of simple SQL queries necessary to construct such partitions. The straightforward approach to optimal partition selection needs at least O(N) queries, where N is the number of pre-assumed partitions of the search space. We show some properties of optimization measures related to discernibility between objects that allow a partition very close to optimal to be constructed using only O(log N) simple queries.
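
To make the query-count argument concrete, the following is a minimal sketch of the straightforward baseline only, not the paper's O(log N) method. It assumes a hypothetical table `objects(a, label)`, treats one class-count query per candidate cut as a "simple query", and scores a cut by the number of object pairs from different classes that it separates, which is one common discernibility-style measure and an assumption here. The table name, column names, and helper functions are all illustrative.

```python
import sqlite3

def class_counts(conn, where="", params=()):
    """One 'simple query': class distribution of the rows matching a condition."""
    sql = "SELECT label, COUNT(*) FROM objects " + where + " GROUP BY label"
    return dict(conn.execute(sql, params).fetchall())

def discernibility(left, total):
    """Pairs of objects from different classes that the cut separates (assumed measure)."""
    right = {k: total[k] - left.get(k, 0) for k in total}
    all_pairs = sum(left.values()) * sum(right.values())
    same_class_pairs = sum(left.get(k, 0) * right[k] for k in total)
    return all_pairs - same_class_pairs

def best_cut_linear(conn, cuts):
    """Baseline: evaluate every candidate cut, issuing one query per cut."""
    total = class_counts(conn)  # one query for the overall class distribution
    queries = 1
    best_cut, best_score = None, -1
    for c in cuts:
        left = class_counts(conn, "WHERE a < ?", (c,))
        queries += 1
        score = discernibility(left, total)
        if score > best_score:
            best_cut, best_score = c, score
    return best_cut, best_score, queries

# Hypothetical toy data: one continuous attribute `a` and a class label.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (a REAL, label TEXT)")
rows = [(1.0, "x"), (2.0, "x"), (3.0, "x"), (4.0, "y"), (5.0, "y"), (6.0, "x")]
conn.executemany("INSERT INTO objects VALUES (?, ?)", rows)

cuts = [1.5, 2.5, 3.5, 4.5, 5.5]  # N = 5 pre-assumed candidate cuts
cut, score, n_queries = best_cut_linear(conn, cuts)
print(cut, score, n_queries)
```

The number of queries grows linearly with the number of candidate cuts, which is the cost the abstract contrasts with an approach needing only O(log N) simple queries.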