Mining collective intelligence in diverse groups

Authors:
Guo-Jun Qi;Charu C. Aggarwal;Jiawei Han;Thomas Huang
Affiliations:
University of Illinois at Urbana-Champaign, Urbana, IL, USA;IBM, Yorktown Heights, NY, USA;University of Illinois at Urbana-Champaign, Urbana, IL, USA;University of Illinois at Urbana-Champaign, Urbana, IL, USA
Venue:
Proceedings of the 22nd international conference on World Wide Web
Year:
2013

Abstract

Collective intelligence, which aggregates information shared by large crowds, is often degraded by unreliable sources that supply low-quality data. This is a barrier to the effective use of collective intelligence in a variety of applications. To address this issue, we propose a probabilistic model that jointly assesses the reliability of sources and finds the true data. We observe that different sources are often not independent of each other. Instead, sources tend to influence one another, which makes them dependent when they share information. High dependency between sources makes collective intelligence vulnerable to the overuse of redundant (and possibly incorrect) information from the dependent sources. We therefore reveal the latent group structure among dependent sources and aggregate information at the group level rather than from individual sources directly. This prevents collective intelligence from being inappropriately dominated by dependent sources. We also explicitly reveal the reliability of groups and minimize the negative impact of unreliable groups. Experimental results on real-world data sets show the effectiveness of the proposed approach compared with existing algorithms.
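
The contrast between source-level and group-level aggregation can be illustrated with a toy sketch. This is not the paper's probabilistic model: the group assignments and group reliabilities below are supplied by hand as hypothetical inputs, whereas the paper infers them as latent quantities. The function names and data are likewise made up for illustration.

```python
from collections import Counter, defaultdict


def source_level_vote(claims):
    """Plain majority vote in which every source counts once."""
    return Counter(claims.values()).most_common(1)[0][0]


def group_level_vote(claims, groups, group_reliability):
    """Collapse each group to a single opinion, then weight it by group reliability.

    claims: source -> claimed value
    groups: source -> group id (assumed known here; latent in the paper)
    group_reliability: group id -> weight (hypothetical values)
    """
    values_by_group = defaultdict(list)
    for source, value in claims.items():
        values_by_group[groups[source]].append(value)

    scores = defaultdict(float)
    for group, values in values_by_group.items():
        # A group contributes one opinion regardless of how many members repeat it.
        group_opinion = Counter(values).most_common(1)[0][0]
        scores[group_opinion] += group_reliability[group]
    return max(scores, key=scores.get)


# Three dependent sources (s1-s3) repeat the same claim; s4 and s5 are independent.
claims = {"s1": "A", "s2": "A", "s3": "A", "s4": "B", "s5": "B"}
groups = {"s1": "g1", "s2": "g1", "s3": "g1", "s4": "g2", "s5": "g3"}
group_reliability = {"g1": 0.4, "g2": 0.8, "g3": 0.7}

print(source_level_vote(claims))                            # A
print(group_level_vote(claims, groups, group_reliability))  # B
```

In this toy data the three dependent sources win the plain vote by repetition alone, while the group-level vote counts them once and lets the two more reliable independent groups decide.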