Decision Trees for Uncertain Data

  • Authors:
  • Smith Tsang;Ben Kao;Kevin Y. Yip;Wai-Shing Ho;Sau Dan Lee

  • Affiliations:
  • -;-;-;-;-

  • Venue:
  • ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Traditional decision tree classifiers work with data whose values are known and precise. We extend such classifiers to handle data with uncertain information, which originates from measurement/quantisation errors, data staleness, multiple repeated measurements, etc. The value uncertainty is represented by multiple values forming a probability distribution function (pdf). We discover that the accuracy of a decision tree classifier can be much improved if the whole pdf, rather than a simple statistic, is taken into account. We extend classical decision tree building algorithms to handle data tuples with uncertain values. Since processing pdf's is computationally more costly, we propose a series of pruning techniques that can greatly improve the efficiency of the construction of decision trees.