STRIP: stream learning of influence probabilities

Authors:
Konstantin Kutzkov; Albert Bifet; Francesco Bonchi; Aristides Gionis
Affiliations:
IT University, Copenhagen, Denmark; Yahoo! Research, Barcelona, Spain; Yahoo! Research, Barcelona, Spain; Aalto University, Espoo, Finland
Venue:
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2013

Abstract

Influence-driven diffusion of information is a fundamental process in social networks. Learning the latent variables of such a process, i.e., the influence strength along each link, is central to understanding the structure and function of complex networks, modeling information cascades, and developing applications such as viral marketing. Motivated by modern microblogging platforms such as Twitter, we study the problem of learning influence probabilities in a data-stream scenario, in which the network topology is relatively stable and the challenge for a learning algorithm is to keep up with a continuous stream of tweets using a small amount of time and memory. Our contribution is a number of randomized approximation algorithms, categorized according to the available space (superlinear, linear, and sublinear in the number of nodes n) and according to different models (landmark and sliding window). Among several results, we show that we can learn influence probabilities with one pass over the data, using O(n log n) space, in both the landmark model and the sliding-window model, and we further show that our algorithm is within a logarithmic factor of optimal. For truly large graphs, when one needs to operate with sublinear space, we show that we can still learn influence probabilities in one pass, assuming that we restrict our attention to the most active users. Our thorough experimental evaluation on a large social graph demonstrates that the empirical performance of our algorithms agrees with that predicted by the theory.
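
To make the landmark and sliding-window settings concrete, the following is a minimal Python sketch of an exact baseline. It is not one of the paper's randomized algorithms. The abstract does not define the estimator, so the ratio used here (the items that v acted on after u, divided by the items u acted on) is an assumption chosen for illustration, and the names `influence_estimates`, `stream`, `edges`, and `window` are hypothetical.

```python
from collections import deque


def influence_estimates(stream, edges, window=None):
    """Exact baseline for illustration only (not the paper's method).

    stream: iterable of (user, item) actions in time order.
    edges:  iterable of (u, v) pairs, read as "u may influence v".
    window: None for the landmark model (every action so far counts),
            or an integer W to keep only the last W actions.
    Assumed estimator: |items v acted on after u| / |items u acted on|.
    """
    # deque(maxlen=None) is unbounded, which gives the landmark model;
    # deque(maxlen=W) silently drops the oldest action once W are held.
    scope = deque(maxlen=window)
    for action in stream:
        scope.append(action)

    # first[user][item] = position of the user's first in-scope action on item
    first = {}
    for pos, (user, item) in enumerate(scope):
        first.setdefault(user, {}).setdefault(item, pos)

    estimates = {}
    for u, v in edges:
        acted_u = first.get(u, {})
        acted_v = first.get(v, {})
        # Count items that v picked up strictly after u did.
        propagated = sum(
            1 for item, t in acted_u.items()
            if item in acted_v and acted_v[item] > t
        )
        estimates[(u, v)] = propagated / len(acted_u) if acted_u else 0.0
    return estimates
```

This baseline stores every action in scope, so its memory grows with the stream (landmark) or with W (sliding window). That cost is what the space-bounded approximation algorithms described in the abstract aim to avoid.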