The input/output complexity of transitive closure

  • Authors:
  • Jeffrey D. Ullman; Mihalis Yannakakis

  • Affiliations:
  • Stanford University; AT&T Bell Laboratories

  • Venue:
  • SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
  • Year:
  • 1990

Abstract

Suppose a directed graph has its arcs stored in secondary memory, and we wish to compute its transitive closure, also storing the result in secondary memory. We assume that an amount of main memory capable of holding s “values” is available, and that s lies between n, the number of nodes of the graph, and e, the number of arcs. The cost measure we use for algorithms is the I/O complexity of Kung and Hong, where we count 1 every time a value is moved into main memory from secondary memory, or vice versa.

In the dense case, where e is close to n^2, we show that I/O equal to O(n^3 / √s) is sufficient to compute the transitive closure of an n-node graph, using main memory of size s. Moreover, it is necessary for any algorithm that is “standard,” in a sense to be defined precisely in the paper. Roughly, “standard” means that paths are constructed only by concatenating arcs and previously discovered paths. This class includes the usual algorithms that work for the generalization of transitive closure to semiring problems. For the sparse case, we show that I/O equal to O(n^2 √(e/s)) is sufficient, although the algorithm we propose meets our definition of “standard” only if the underlying graph is acyclic. We also show that Ω(n^2 √(e/s)) is necessary for any standard algorithm in the sparse case. That settles the I/O complexity of the sparse/acyclic case, for standard algorithms. It is unknown whether this complexity can be achieved in the sparse, cyclic case, by a standard algorithm, and it is unknown whether the bound can be beaten by nonstandard algorithms.

We then consider a special kind of standard algorithm, in which paths are constructed only by concatenating arcs and old paths, never by concatenating two old paths. This restriction seems essential if we are to take advantage of sparseness. Unfortunately, we show that almost another factor of n I/O is necessary. That is, there is an algorithm in this class using I/O O(n^3 √(e/s)) for arbitrary sparse graphs, including cyclic ones. Moreover, every algorithm in the restricted class must use Ω(n^3 √(e/s) / log^3 n) I/O, on some cyclic graphs.
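
To make the cost model concrete, the following is a minimal sketch of a tiled (blocked) Warshall-style closure that counts one unit of I/O per value moved between "secondary memory" and a main memory of s values. It is the textbook blocked scheme, not necessarily the algorithm of the paper; the function name `blocked_transitive_closure`, the in-memory matrix standing in for secondary storage, and the accounting (every tile counted as loaded separately, the target tile also written back) are all assumptions made for illustration. With tile side b ≈ √(s/3), there are about (n/b)^3 tile updates of at most 4b^2 values each, which is consistent with the O(n^3 / √s) dense-case bound quoted above.

```python
import math


def blocked_transitive_closure(adj, s):
    """Tiled Warshall closure of a boolean adjacency matrix.

    Returns (reach, io), where reach[i][j] is True iff there is a path of
    length >= 1 from i to j, and io is an illustrative count of values
    moved under the assumed accounting described in the lead-in.
    """
    n = len(adj)
    # Assumption: three b-by-b tiles must fit in main memory of s values.
    b = max(1, math.isqrt(s // 3))
    reach = [row[:] for row in adj]  # stands in for secondary memory
    io = 0
    blocks = [(lo, min(lo + b, n)) for lo in range(0, n, b)]

    def update(I, J, K):
        # reach[i][j] |= reach[i][k] and reach[k][j], for i in I, j in J, k in K
        nonlocal io
        (i0, i1), (j0, j1), (k0, k1) = I, J, K
        # Read target tile and two operand tiles, then write the target back.
        io += 2 * (i1 - i0) * (j1 - j0)
        io += (i1 - i0) * (k1 - k0) + (k1 - k0) * (j1 - j0)
        for k in range(k0, k1):
            for i in range(i0, i1):
                if reach[i][k]:
                    for j in range(j0, j1):
                        if reach[k][j]:
                            reach[i][j] = True

    for K in blocks:
        update(K, K, K)                      # phase 1: diagonal tile
        for J in blocks:
            if J != K:
                update(K, J, K)              # phase 2a: tiles in block-row K
        for I in blocks:
            if I != K:
                update(I, K, K)              # phase 2b: tiles in block-column K
        for I in blocks:
            for J in blocks:
                if I != K and J != K:
                    update(I, J, K)          # phase 3: all remaining tiles
    return reach, io


if __name__ == "__main__":
    # Hypothetical 4-node chain 0 -> 1 -> 2 -> 3, with room for 12 values.
    chain = [[False] * 4 for _ in range(4)]
    for u, v in [(0, 1), (1, 2), (2, 3)]:
        chain[u][v] = True
    closure, io = blocked_transitive_closure(chain, s=12)
    print(closure[0][3], io)
```

Because every update only joins an already known path i→k with an already known path k→j, the sketch falls under the abstract's rough description of a "standard" algorithm; it joins two old paths, so it is not in the restricted arc-plus-old-path class discussed in the last paragraph.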