Graph Distances in the Data-Stream Model

  • Authors:
  • Joan Feigenbaum;Sampath Kannan;Andrew McGregor;Siddharth Suri;Jian Zhang

  • Affiliations:
  • feigenbaum-joan@cs.yale.edu;kannan@cis.upenn.edu;-;suri@yahoo-inc.com;jz@lsu.edu

  • Venue:
  • SIAM Journal on Computing
  • Year:
  • 2008

Abstract

We explore problems related to computing graph distances in the data-stream model. The goal is to design algorithms that can process the edges of a graph in an arbitrary order given only a limited amount of working memory. We are motivated by both the practical challenge of processing massive graphs such as the web graph and the desire for a better theoretical understanding of the data-stream model. In particular, we are interested in the trade-offs between model parameters such as per-data-item processing time, total space, and the number of passes that may be taken over the stream. These trade-offs are more apparent when considering graph problems than they were in previous streaming work that solved problems of a statistical nature. Our results include the following:
(1) Spanner construction: There exists a single-pass, $\tilde{O}(tn^{1+1/t})$-space, $\tilde{O}(t^2n^{1/t})$-time-per-edge algorithm that constructs a $(2t+1)$-spanner. For $t=\Omega(\log n/\log\log n)$, the algorithm satisfies the semistreaming space restriction of $O(n\operatorname{polylog}n)$ and has per-edge processing time $O(\operatorname{polylog}n)$. This resolves an open question from [J. Feigenbaum et al., Theoret. Comput. Sci., 348 (2005), pp. 207-216].
(2) Breadth-first-search (BFS) trees: For any even constant $k$, we show that any algorithm that computes the first $k$ layers of a BFS tree from a prescribed node with probability at least $2/3$ requires either greater than $k/2$ passes or $\tilde{\Omega}(n^{1+1/k})$ space. Since constructing BFS trees is an important subroutine in many traditional graph algorithms, this demonstrates the need for new algorithmic techniques when processing graphs in the data-stream model.
(3) Graph-distance lower bounds: Any $t$-approximation of the distance between two nodes requires $\Omega(n^{1+1/t})$ space. We also prove lower bounds for determining the length of the shortest cycle and other graph properties.
(4) Techniques for decreasing per-edge processing: We discuss two general techniques for speeding up the per-edge computation time of streaming algorithms while increasing the space by only a small factor.
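
To make the single-pass setting of result (1) concrete, the following is a minimal Python sketch of a streaming spanner construction. It is not the algorithm of the paper: it is the simple greedy rule (keep an edge only if its endpoints are currently more than $2t+1$ hops apart in the subgraph kept so far), which yields a $(2t+1)$-spanner for unweighted graphs but spends a bounded breadth-first search on every edge, so it does not achieve the per-edge time stated above. The names `within_distance` and `greedy_stream_spanner` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict, deque


def within_distance(adj, source, target, limit):
    """Return True if target is at most `limit` hops from source in adj."""
    if source == target:
        return True
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == limit:
            # Nodes at the hop limit are not expanded any further.
            continue
        for nxt in adj[node]:
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return False


def greedy_stream_spanner(edge_stream, t):
    """Read each edge once, in arrival order, and return the kept edges.

    Assumption (illustrative, not from the paper): an edge (u, v) is kept
    only if u and v are more than 2t+1 hops apart in the spanner so far.
    Every discarded edge is then covered by a kept path of length <= 2t+1.
    """
    stretch = 2 * t + 1
    adj = defaultdict(set)  # adjacency lists of the kept subgraph only
    kept = []
    for u, v in edge_stream:
        if not within_distance(adj, u, v, stretch):
            adj[u].add(v)
            adj[v].add(u)
            kept.append((u, v))
    return kept


if __name__ == "__main__":
    # A 4-cycle streamed in order with t = 1: the closing edge is dropped
    # because its endpoints are already 3 hops apart.
    print(greedy_stream_spanner([(0, 1), (1, 2), (2, 3), (3, 0)], 1))
```

Only the kept edges are stored, so working memory is proportional to the spanner size rather than to the length of the stream; the cost is the search performed per arriving edge, which is the kind of space/time trade-off the abstract discusses.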