Test-cost sensitive classification using greedy algorithm on training data

  • Authors:
  • Chang Wan

  • Affiliations:
  • School of Information Science and Technology, Sun Yat-sen University, Guangzhou, China

  • Venue:
  • ISICA'10 Proceedings of the 5th international conference on Advances in computation and intelligence
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Much work has been done to deal with the test-cost sensitive learning on data with missing values. There is a confliction of efficiency and accuracy among previous strategies. Sequential test strategies have high accuracy but low efficiency because of their sequential property. Some batch strategies have high efficiency but lead to poor performance since they make all decisions at one time using initial information. In this paper, we propose a new test strategy, GTD algorithm, to address this problem. Our algorithm uses training data to judge the benefits brought by an unknown attribute and chooses the most useful unknown attribute each time until there is no rewarding unknown attributes. It is more reasonable to judge the utility of an unknown attribute from the real performance on training data other than from the estimation. Our strategy is meaningful since it has high efficiency (We only use training data so GTD is not sequential) and lower total costs than the previous strategies at the same time. The experiments also prove that our algorithm significantly outperforms previous algorithms especially when there is a high missing rate and large fluctuations of test costs.