Network quantification despite biased labels

Authors: Lei Tang, Huiji Gao, Huan Liu
Affiliations: Arizona State University, Tempe, AZ (all authors)
Venue: Proceedings of the Eighth Workshop on Mining and Learning with Graphs
Year: 2010

Abstract

The increasing availability of the participatory web and social media presents enormous opportunities to study human relations and collective behaviors. Many decision-making applications need generalized properties of the population in a network, such as the proportion of actors belonging to a given category, rather than the category of each individual. While data mining and machine learning researchers have developed many methods for link-based classification and relational learning, most are optimized to classify individual nodes in a network. Accurately estimating the prevalence of a class in a network instead requires a quantification method. In this work, two kinds of approaches are presented: quantification based on classification and quantification based on link analysis. Extensive experiments are conducted on several representative network data sets, with findings reported on the efficacy and robustness of the different quantification methods, providing insights for further quantifying the ebb and flow of online collective behaviors at the macro level.
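
The abstract does not spell out the methods themselves, so the following is only an illustrative sketch of one generic form of quantification based on classification: counting predicted positives, then correcting that count with the classifier's error rates. It is not taken from the paper. The function names, the example predictions, and the true and false positive rates (tpr, fpr) are all hypothetical, and the rates are assumed to have been estimated separately, for instance by cross-validation on labeled nodes.

```python
def classify_and_count(predictions):
    """Naive prevalence estimate: the fraction of nodes predicted positive."""
    return sum(predictions) / len(predictions)


def adjusted_count(predictions, tpr, fpr):
    """Correct the naive estimate using assumed classifier error rates.

    If p is the true prevalence, the expected observed positive rate is
    tpr * p + fpr * (1 - p). Solving for p gives the correction below.
    """
    if tpr <= fpr:
        raise ValueError("tpr must exceed fpr for the correction to be defined")
    observed = classify_and_count(predictions)
    estimate = (observed - fpr) / (tpr - fpr)
    # Clip to [0, 1], since noisy rates can push the estimate out of range.
    return min(1.0, max(0.0, estimate))


# Hypothetical data: 1000 nodes, 300 of them predicted positive.
predictions = [1] * 300 + [0] * 700

naive = classify_and_count(predictions)
# Assumed (illustrative) error rates for the classifier.
corrected = adjusted_count(predictions, tpr=0.8, fpr=0.1)

print(f"naive estimate:     {naive:.3f}")
print(f"corrected estimate: {corrected:.3f}")
```

The point of the sketch is that a classifier tuned for per-node accuracy can still give a skewed class proportion, which is why the prevalence estimate is treated as a separate problem.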