Exploiting contexts to deal with uncertainty in classification

Authors:
Bianca Zadrozny;Gisele L. Pappa;Wagner Meira, Jr.;Marcos André Gonçalves;Leonardo Rocha;Thiago Salles
Affiliations:
Fluminense Fed. Univ., Niterói, Brazil;Fed. Univ. of Minas Gerais, Belo Horizonte, Brazil;Fed. Univ. of Minas Gerais, Belo Horizonte, Brazil;Fed. Univ. of Minas Gerais, Belo Horizonte, Brazil;Fed. Univ. São João Del Rei, São João Del Rei, Brazil;Fed. Univ. of Minas Gerais, Belo Horizonte, Brazil
Venue:
Proceedings of the 1st ACM SIGKDD Workshop on Knowledge Discovery from Uncertain Data
Year:
2009

Abstract

Uncertainty is often inherent in data, yet only a few data mining algorithms handle it. In this paper we focus on how to account for uncertainty in classification algorithms, in particular when data attributes should not be considered completely truthful for classifying a given sample. Our starting point is that each piece of data comes from a potentially different context and that, by estimating the context probabilities of an unknown sample, we may derive a weight that quantifies the influence of each context on its classification. We propose a lazy classification strategy that incorporates this uncertainty into both the training and the use of classifiers. We also propose uK-NN, an extension of the traditional K-NN that implements our approach. Finally, we illustrate uK-NN, which is currently being evaluated experimentally, using a toy document classification example.
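The abstract does not give the details of uK-NN, so the following is only a minimal sketch of how context probabilities might weight a K-NN vote, not the authors' algorithm. It assumes that each training sample carries a single context label and that the context probabilities of the unknown sample have already been estimated. The function name `context_weighted_knn`, the tuple layout, and the use of Euclidean distance are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def context_weighted_knn(train, query, context_probs, k=3):
    """Classify `query` by a K-NN vote weighted by context probability.

    Hypothetical sketch, not the uK-NN of the paper.
    train:         list of (features, label, context) tuples
    query:         feature tuple of the unknown sample
    context_probs: dict mapping context -> estimated probability that
                   the unknown sample belongs to that context (assumed given)
    """
    # Select the k nearest training samples by Euclidean distance.
    neighbors = sorted(train, key=lambda s: math.dist(s[0], query))[:k]

    # Each neighbor votes for its label with a weight equal to the
    # probability of its context for the unknown sample.
    votes = defaultdict(float)
    for _features, label, context in neighbors:
        votes[label] += context_probs.get(context, 0.0)

    # Return the label with the largest accumulated weight.
    return max(votes, key=votes.get)


# Toy data: two nearby "sports" documents from one context and one
# "politics" document from another.
train = [
    ((0.0, 0.0), "sports", "2008"),
    ((0.1, 0.0), "sports", "2008"),
    ((0.0, 0.1), "politics", "2009"),
]
query = (0.0, 0.0)

uniform = context_weighted_knn(train, query, {"2008": 0.5, "2009": 0.5})
skewed = context_weighted_knn(train, query, {"2008": 0.1, "2009": 0.9})
```

With equal context probabilities the vote reduces to a plain majority among the neighbors; when the unknown sample is judged far more likely to come from the second context, the single neighbor from that context can outweigh the other two.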