Reflect and correct: A misclassification prediction approach to active inference

Authors: Mustafa Bilgic; Lise Getoor
Affiliations: University of Maryland at College Park, College Park, MD (both authors)
Venue: ACM Transactions on Knowledge Discovery from Data (TKDD)
Year: 2009

Abstract

Information diffusion, viral marketing, graph-based semi-supervised learning, and collective classification all attempt to model and exploit the relationships among nodes in a network to improve the performance of node labeling algorithms. However, the advantage of exploiting these relationships can sometimes become a disadvantage. Simple models like label propagation and iterative classification can aggravate a misclassification by propagating mistakes through the network, while more complex models that define and optimize a global objective function, such as Markov random fields and graph mincuts, can misclassify a set of nodes jointly. This problem can be mitigated if the classification system is allowed to ask for the correct labels of a few nodes during inference. However, determining the optimal set of labels to acquire is intractable under relatively general assumptions, which forces us to resort to approximate and heuristic techniques. We describe three such techniques in this article. The first directly approximates the value of the label acquisition objective function and greedily acquires the label that provides the most improvement. The second is a simple technique based on an analogy we draw between viral marketing and label acquisition. Finally, we propose a method, which we refer to as reflect and correct, that learns to predict when the classification system is likely to make mistakes and suggests acquisitions to correct those mistakes. We empirically show on a variety of synthetic and real-world datasets that the reflect and correct method significantly outperforms the other two techniques, as well as other approaches based on network structural measures such as node degree and network clustering.
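
To make the setting concrete, the following is a minimal sketch of label acquisition during inference, under several assumptions that are not taken from the article. Inference is a toy iterative classifier in which each node adopts the majority label of its neighbors, and acquisition uses node degree as the score, which is one of the structural baselines the abstract mentions. It is not the reflect and correct method, whose details are not given in the abstract. The names `infer`, `acquire_greedily`, `degree_score`, and the four-node star graph are hypothetical and chosen only for illustration.

```python
from collections import Counter


def infer(graph, local_guess, observed, rounds=10):
    """Toy iterative classification (an assumed stand-in for a real model).

    Each unobserved node repeatedly adopts the majority label among its
    neighbors; ties keep the node's current label. Observed labels are clamped.
    """
    labels = dict(local_guess)
    labels.update(observed)
    for _ in range(rounds):
        changed = False
        for node in sorted(graph):
            if node in observed or not graph[node]:
                continue
            counts = Counter(labels[m] for m in graph[node])
            top = max(counts.values())
            winners = sorted(lab for lab, c in counts.items() if c == top)
            if labels[node] not in winners:
                labels[node] = winners[0]
                changed = True
        if not changed:
            break
    return labels


def degree_score(graph, labels, node):
    """Structural baseline: prefer nodes with many neighbors."""
    return len(graph[node])


def acquire_greedily(graph, local_guess, oracle, budget, score):
    """Acquire up to `budget` labels, one at a time, picking the best-scoring node."""
    observed = {}
    for _ in range(budget):
        candidates = sorted(n for n in graph if n not in observed)
        if not candidates:
            break
        labels = infer(graph, local_guess, observed)
        pick = max(candidates, key=lambda n: score(graph, labels, n))
        observed[pick] = oracle(pick)  # ask for the correct label
    return observed, infer(graph, local_guess, observed)


# Hypothetical star graph: hub "a" with leaves "b", "c", "d"; every true label is "pos".
graph = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"]}
truth = {node: "pos" for node in graph}
# Assumed content-only guesses: only "d" starts out correct.
guess = {"a": "neg", "b": "neg", "c": "neg", "d": "pos"}

before = infer(graph, guess, {})  # the hub's mistake spreads to "d"
observed, after = acquire_greedily(graph, guess, truth.get, 1, degree_score)
```

In this toy case, inference without any acquired labels turns the one correct node wrong, because the misclassified hub dominates its leaves. Acquiring the single label of the hub then corrects every node. The `score` argument is the place where a different criterion, such as an estimate of where the classifier is likely to be wrong, could be substituted.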