Asking generalized queries with minimum cost

  • Authors:
  • Jun Du;Charles X. Ling

  • Affiliations:
  • Department of Computer Science, The University of Western Ontario, London, Ontario, Canada;Department of Computer Science, The University of Western Ontario, London, Ontario, Canada

  • Venue:
  • PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part II
  • Year:
  • 2011

Abstract

Previous work on active learning usually asks only specific queries. A more natural approach is to ask generalized queries with don't-care features. Because each generalized query can often represent a set of specific ones, its answer is usually more helpful in speeding up the learning process. However, despite these advantages, generalized queries usually require more expertise (or effort) from the oracle to answer accurately in real-world situations. Therefore, in this paper, we make the more realistic assumption that the more general a query is, the higher its querying cost. This yields a trade-off: asking generalized queries can speed up learning, but usually at high cost, whereas asking specific queries is much cheaper but may slow the learning process. To resolve this issue, we propose two novel active learning algorithms for two scenarios: one balances predictive accuracy against querying cost, and the other minimizes the total cost of misclassification and querying. We demonstrate that our new methods can significantly outperform existing active learning algorithms in both scenarios.
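The notion of a generalized query with don't-care features, and the assumption that cost grows with generality, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the paper: the feature names and domains, the `DONT_CARE` marker, the helper names `expand` and `query_cost`, and in particular the linear cost function, since the abstract does not state the cost model the authors actually use.

```python
from itertools import product

# Illustrative convention: None marks a don't-care feature.
DONT_CARE = None


def expand(query, domains):
    """Enumerate every specific example covered by a generalized query."""
    names = list(query)
    choices = [
        domains[name] if query[name] is DONT_CARE else [query[name]]
        for name in names
    ]
    return [dict(zip(names, combo)) for combo in product(*choices)]


def query_cost(query, base_cost=1.0, per_dont_care=0.5):
    """Hypothetical cost model: cost rises linearly with the number of
    don't-care features (an assumption, not the paper's definition)."""
    n_dont_care = sum(1 for value in query.values() if value is DONT_CARE)
    return base_cost + per_dont_care * n_dont_care


# Hypothetical feature domains for a toy diagnosis task.
domains = {
    "fever": ["yes", "no"],
    "cough": ["yes", "no"],
    "age": ["child", "adult", "senior"],
}

specific = {"fever": "yes", "cough": "no", "age": "adult"}
general = {"fever": "yes", "cough": DONT_CARE, "age": DONT_CARE}

for name, q in [("specific", specific), ("general", general)]:
    print(name, "covers", len(expand(q, domains)), "examples at cost", query_cost(q))
```

Under these assumptions, the generalized query covers six specific examples with a single answer but costs twice as much as the specific query, which is the trade-off the abstract describes.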