On Active Learning for Data Acquisition

  • Authors:
  • Zhiqiang Zheng; Balaji Padmanabhan

  • Affiliations:
  • -;-

  • Venue:
  • ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
  • Year:
  • 2002

Abstract

Many applications are characterized by having naturally incomplete data on customers - where data on only some fixed set of local variables is gathered. However, having a more complete picture can help build better models. The naïve solution to this problem - acquiring complete data for all customers - is often impractical due to the costs of doing so. A possible alternative is to acquire complete data for "some" customers and to use this to improve the models built. The data acquisition problem is determining how many, and which, customers to acquire additional data from. In this paper we suggest using active learning based approaches for the data acquisition problem. In particular, we present initial methods for data acquisition and evaluate these methods experimentally on web usage data and UCI datasets. Results show that the methods perform well and indicate that active learning based methods for data acquisition can be a promising area for data mining research.