Trace complexity of network inference

Authors:
Bruno Abrahao;Flavio Chierichetti;Robert Kleinberg;Alessandro Panconesi
Affiliations:
Cornell University, Ithaca, New York, USA;Sapienza University, Rome, Italy;Cornell University, Ithaca, New York, USA;Sapienza University, Rome, Italy
Venue:
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2013

Abstract

The network inference problem consists of reconstructing the edge set of a network given traces representing the chronology of infection times as epidemics spread through the network. This problem is a paradigmatic representative of prediction tasks in machine learning that require deducing a latent structure from observed patterns of activity in a network, and that often demand an unrealistically large amount of resources (e.g., available data or computational time). A fundamental question is to understand which properties we can predict with a reasonable degree of accuracy with the available resources, and which we cannot. We define the trace complexity as the number of distinct traces required to achieve high fidelity in reconstructing the topology of the unobserved network or, more generally, some of its properties. We give algorithms that are competitive with existing network inference approaches while being simpler and more efficient. Moreover, we show that our algorithms are nearly optimal by proving an information-theoretic lower bound on the number of traces that an optimal inference algorithm requires for performing this task in the general case. Given these strong lower bounds, we turn our attention to special cases, such as trees and bounded-degree graphs, and to property recovery tasks, such as reconstructing the degree distribution without inferring the network. We show that these problems require a much smaller (and more realistic) number of traces, making them potentially solvable in practice.
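
To make the notion of a trace concrete, the following is a minimal sketch and not the algorithms referred to in the abstract. It assumes a toy cascade model (a single random source, independent exponential delays on each edge) and a simple heuristic chosen here for illustration: the second node infected in a trace can only have been infected by the source, so the first two nodes of every trace must be adjacent. The names `simulate_trace`, `first_edge`, and the 6-node cycle are hypothetical.

```python
import heapq
import random


def simulate_trace(adj, source, rng):
    """Simulate one cascade; return [(node, infection_time), ...] sorted by time.

    Assumed model: each edge transmits after an independent Exp(1) delay,
    so infection times are first-passage times from the source.
    """
    times = {source: 0.0}
    heap = [(0.0, source)]
    done = set()
    while heap:
        t, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry for an already-infected node
        done.add(u)
        for v in adj[u]:
            if v in done:
                continue
            tv = t + rng.expovariate(1.0)  # delay on edge (u, v)
            if tv < times.get(v, float("inf")):
                times[v] = tv
                heapq.heappush(heap, (tv, v))
    return sorted(times.items(), key=lambda kv: kv[1])


def first_edge(traces):
    """Illustrative heuristic: link the first two infected nodes of each trace."""
    edges = set()
    for trace in traces:
        if len(trace) >= 2:
            (u, _), (v, _) = trace[0], trace[1]
            edges.add(frozenset((u, v)))
    return edges


# Hypothetical example: a cycle on 6 nodes.
n = 6
adj = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
true_edges = {frozenset((i, (i + 1) % n)) for i in range(n)}

rng = random.Random(0)
traces = [simulate_trace(adj, rng.randrange(n), rng) for _ in range(200)]
inferred = first_edge(traces)

# Every inferred edge is a true edge; how many traces are needed before
# all true edges appear is the kind of quantity trace complexity measures.
print(len(inferred), "of", len(true_edges), "edges recovered")
```

Under these assumptions the heuristic never reports a false edge, but it learns at most one edge per trace, which hints at why the number of traces matters for full reconstruction.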