Exploiting contexts to deal with uncertainty in classification

  • Authors:
  • Bianca Zadrozny;Gisele L. Pappa;Wagner Meira, Jr.;Marcos André Gonçalves;Leonardo Rocha;Thiago Salles

  • Affiliations:
  • Fluminense Fed. Univ., Niterói, Brazil;Fed. Univ. of Minas Gerais, Belo Horizonte, Brazil;Fed. Univ. of Minas Gerais, Belo Horizonte, Brazil;Fed. Univ. of Minas Gerais, Belo Horizonte, Brazil;Fed. Univ. São João Del Rei, São João Del Rei, Brazil;Fed. Univ. of Minas Gerais, Belo Horizonte, Brazil

  • Venue:
  • Proceedings of the 1st ACM SIGKDD Workshop on Knowledge Discovery from Uncertain Data
  • Year:
  • 2009

Abstract

Uncertainty is often inherent in data, yet only a few data mining algorithms handle it. In this paper we focus on how to account for uncertainty in classification algorithms, in particular when data attributes should not be considered completely truthful for classifying a given sample. Our starting point is that each piece of data comes from a potentially different context; by estimating the context probabilities of an unknown sample, we can derive a weight that quantifies the influence of each context. We propose a lazy classification strategy that incorporates this uncertainty into both the training and the usage of classifiers. We also propose uK-NN, an extension of the traditional K-NN that implements our approach. Finally, we illustrate uK-NN, which is currently being evaluated experimentally, using a toy document classification example.
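
The abstract does not give the details of uK-NN, so the following is only a minimal sketch of the general idea of context-weighted nearest-neighbor voting, not the authors' algorithm. It assumes that every training sample carries a context tag, that the context probabilities of the unknown sample have already been estimated by some other means, and that a neighbor's vote is simply weighted by the probability of its context. The function name `context_weighted_knn`, the toy feature vectors, and the context names `old` and `recent` are all hypothetical.

```python
import math
from collections import defaultdict


def context_weighted_knn(train, query, context_probs, k=3):
    """Classify `query` by a context-weighted vote among its k nearest neighbors.

    train         : list of (features, label, context) tuples (hypothetical layout)
    query         : feature tuple of the unknown sample
    context_probs : dict mapping context -> estimated probability that the
                    unknown sample belongs to that context (assumed given)
    """
    # Lazy strategy: nothing is precomputed; all work happens at query time.
    neighbors = sorted(train, key=lambda s: math.dist(s[0], query))[:k]

    # Assumption: each neighbor votes with the probability of its own context.
    votes = defaultdict(float)
    for _features, label, context in neighbors:
        votes[label] += context_probs.get(context, 0.0)

    return max(votes, key=votes.get), dict(votes)


# Toy document example: two term-frequency features per document.
train = [
    ((1.0, 0.0), "sports", "old"),
    ((0.9, 0.1), "sports", "old"),
    ((0.8, 0.3), "politics", "recent"),
    ((0.0, 1.0), "politics", "recent"),
]
query = (0.9, 0.2)
context_probs = {"old": 0.2, "recent": 0.8}  # assumed estimates

label, votes = context_weighted_knn(train, query, context_probs, k=3)
print(label, votes)
```

In this toy setting, two of the three nearest neighbors are labeled `sports`, so a plain majority vote would pick `sports`; because the unknown sample is judged far more likely to come from the `recent` context, the weighted vote favors `politics` instead. The sketch covers only the usage side of classification; how the paper incorporates uncertainty into training is not specified in the abstract.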