
On the All-Pairs Shortest Path Algorithm

of Moffat and Takaoka*

Kurt Mehlhorn and Volker Priebe**

Max-Planck-Institut für Informatik, Im Stadtwald, 66123 Saarbrücken, Germany

Abstract. We review how to solve the all-pairs shortest path problem in a non-negatively weighted digraph with n vertices in expected time O(n^2 log n). This bound is shown to hold with high probability for a wide class of probability distributions on non-negatively weighted digraphs. We also prove that for a large class of probability distributions Ω(n log n) time is necessary with high probability to compute shortest path distances with respect to a single source.

**1 Introduction**

Given a complete digraph in which all the edges have non-negative length, we want to compute the shortest path distance between each pair of vertices. This is one of the most basic questions in graph algorithms, since a variety of combinatorial optimization problems can be expressed in these terms. As far as worst-case complexity is concerned, we can solve an n-vertex problem in time O(n^3) by either Floyd's algorithm [3] or by n calls of Dijkstra's algorithm [2].
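As a concrete baseline, Floyd's O(n^3) algorithm fits in a few lines. This is only an illustration of the worst-case baseline mentioned above, not the algorithm analyzed in this paper; the function name and matrix representation are our own:

```python
def floyd_warshall(w):
    """All-pairs shortest path distances by Floyd's algorithm, O(n^3).

    w: n x n matrix of non-negative edge weights, with w[i][i] == 0.
    Returns the matrix of shortest path distances.
    """
    n = len(w)
    d = [row[:] for row in w]  # copy; d[i][j] is shortest via intermediates < k
    for k in range(n):         # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```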

Fredman's algorithm [4] uses efficient distance matrix multiplication techniques and results in a running time of O(n^3((log log n)/log n)^{1/3}) (slightly improved to O(n^3((log log n)/log n)^{1/2}) by Takaoka [14]). Recently, Karger, Koller, and Phillips [8] presented an algorithm that runs in time O(nm* + n^2 log n), where m* denotes the number of edges that are a shortest path from their source to their target.

However, worst-case analysis sometimes fails to cover the advantages of algorithms that perform well in practice; average-case analysis has turned out to be more appropriate for these purposes. We are not only interested in algorithms with good expected running time but in algorithms that finish their computations within a certain time bound with high probability (and might therefore be called reliable).

Two kinds of probability distributions on non-negatively weighted complete digraphs have been considered in the literature. In the so-called uniform model, the edge lengths are independent, identically distributed random variables. In the so-called endpoint-independent model, a sequence c_{vj}, 1 ≤ j ≤ n, of n non-negative weights is fixed for each vertex v of K_n arbitrarily. These weights are assigned randomly to the n edges with source v, i.e., a random injective mapping π_v from [1..n] to V is chosen and c_{vj} is made the weight of edge (v, π_v(j)) for all j, 1 ≤ j ≤ n.

* This work is supported by the ESPRIT II Basic Research Actions Program of the EC under contract no. 7141 (project ALCOM II) and the BMFT-project "Softwareökonomie und Softwaresicherheit" ITS 9103.
** Research supported by a Graduiertenkolleg graduate fellowship of the Deutsche Forschungsgemeinschaft. E-mail: priebe@mpi-sb.mpg.de

Frieze and Grimmett [6] gave an algorithm with O(n^2 log n) expected running time in the uniform model when the common distribution function F of the edge weights satisfies F(0) = 0, F'(0) exists, and F'(0) > 0. Under these assumptions, m* = O(n log n) with high probability, and so the algorithm of Karger et al. also achieves running time O(n^2 log n) with high probability.

The endpoint-independent model is much more general and therefore harder to analyze. Spira [12] proved an expected time bound of O(n^2 (log n)^2), which was later improved by Bloniarz [1] to O(n^2 log n log* n). (We use log to denote logarithms to base 2 and ln to denote natural logarithms; log* z := 1 for z ≤ 2 and log* z := 1 + log* log z for z > 2.) In [10] and [11], Moffat and Takaoka describe two algorithms with an expected time bound of O(n^2 log n). The algorithm in [11] is a simplified version of [10]. In this paper, we present an even simpler version of the algorithm in [11] and also correct a small oversight in the analysis given by Moffat and Takaoka.

Moreover, we prove that the running time of the modified version is O(n^2 log n) with high probability.

The Z_i's are dependent as well; therefore, we do not expect the relation E[∏_i Z_i] = ∏_i E[Z_i] to hold. However, considering the underlying experiment, we may conjecture that if Z_1 is 'large', then it is less likely that Z_2 is 'large' as well. The following lemma proves that the Z_i's are indeed negatively correlated.

(A proof is given in the appendix.)

**Lemma 1.** For any I ⊆ {1, ..., m} with |I| = k ≥ 1,

where [z]_k := z (z − 1) ⋯ (z − k + 1).

Lemma 1 suffices to establish large deviation estimates of the Chernoff–Hoeffding kind for Z := ∑_i Z_i. Let X_1, ..., X_n be random variables that take values in [0, 1] and let X := ∑_{1 ≤ i ≤ n} X_i. Following [13], we call the X_i's 1-correlated if for all non-empty I ⊆ {1, ..., n}

E[∏_{i ∈ I} X_i] ≤ ∏_{i ∈ I} E[X_i].

Note that if the random variables X_1, ..., X_n and Y_1, ..., Y_m are both 1-correlated and if the X_i's are independent of the Y_j's, then the whole set of random variables X_1, ..., X_n, Y_1, ..., Y_m is 1-correlated as well.

**Lemma 2 ([13]).** Let X be the sum of 1-correlated random variables X_1, ..., X_n with values in [0, 1]. Then for any c > 0,

We will also use Azuma's inequality. The following formulation appears in [9].

**Lemma 3.** Let X_1, ..., X_n be independent random variables, with X_k taking values in a set A_k for each k. Suppose that the function f : ∏_k A_k → ℝ satisfies |f(x) − f(y)| ≤ c whenever the vectors x and y differ only in a single coordinate. Let Y be the random variable f(X_1, ..., X_n). Then for any t > 0,
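For concreteness, the conclusion of Lemma 3 in its standard bounded-differences form (our reconstruction; the constant as stated in [9] may differ) reads:

```latex
\Pr\bigl(\,|Y - \mathbf{E}[Y]| \ge t\,\bigr)
  \;\le\; 2\exp\!\left(-\frac{2t^2}{n c^2}\right).
```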

We use the following terminology for weighted digraphs. For an edge e = (u, v), we call u the source and v the target or endpoint of e. The weight of an edge is denoted by c(e). We will interpret an entry in the adjacency list of a vertex u either as the endpoint v of an edge with source u or as the edge (u, v) itself, as is convenient.

**3 The Algorithm of Moffat and Takaoka**

We will now review the algorithm of Moffat and Takaoka in [11] and their analysis. We are given a complete digraph on n vertices with non-negative edge weights. The algorithm first sorts all adjacency lists in order of increasing weight (total time O(n^2 log n)) and then solves n single source shortest path problems, one for each vertex of G. The single source shortest path problem, say, with source s ∈ V, is solved in two phases. Both phases are variants of Dijkstra's algorithm [2] and only differ in the way the priority queue is handled.

Dijkstra's algorithm labels the vertices in order of increasing distance from the source. We use S to denote the set of labeled vertices and U = V − S to denote the set of unlabeled vertices. Initially, only the source vertex is labeled, i.e., S = {s}. For each labeled vertex v, its exact distance d(v) from the source is known. For the source node s, we have d(s) = 0. For each labeled vertex v, one of its outgoing edges is called its current edge and is denoted ce(v). We maintain the invariant that all edges preceding the current edge ce(v) in v's (sorted) adjacency list have their endpoint already labeled. Both phases use a priority queue. A priority queue stores a set of pairs (x, k) where k is a real number and is called the key of the pair. We assume that priority queues are implemented as Fibonacci heaps [5]. Fibonacci heaps support the insertion of a new pair (x, k) in constant time and the deletion of a pair with minimum key (delete min operation) in amortized time O(log p), where p is the number of pairs in the priority queue. They also support an operation decrease key in constant amortized time. A decrease key operation takes a pointer to a node in a Fibonacci heap containing, say, the pair (x, k), and allows the replacement of k by a smaller key k'.
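Python's standard library has no Fibonacci heap; a common stand-in that offers the same interface (insert, delete min, decrease key) is a binary heap with lazy deletion. The class below is our own sketch, not the paper's data structure; note that decrease key costs O(log n) here rather than the amortized O(1) of a Fibonacci heap:

```python
import heapq

class LazyPQ:
    """Min-priority queue with decrease-key emulated via lazy deletion.

    Stale (key, item) entries are left in the heap and skipped at
    delete-min time, when they no longer match the current key.
    """
    def __init__(self):
        self._heap = []   # (key, item) pairs, possibly stale
        self._key = {}    # current key per item

    def insert(self, item, key):
        self._key[item] = key
        heapq.heappush(self._heap, (key, item))

    def decrease_key(self, item, key):
        if key < self._key[item]:
            self._key[item] = key
            heapq.heappush(self._heap, (key, item))  # old entry goes stale

    def delete_min(self):
        while True:
            key, item = heapq.heappop(self._heap)
            if self._key.get(item) == key:  # skip stale entries
                del self._key[item]
                return item, key
```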

Phase I. In Phase I, the additional invariant is maintained that the targets of all current edges are unlabeled. For each vertex u ∈ U, we also maintain a list L(u) of all vertices v ∈ S whose current edge ends in u. The priority queue contains all vertices in U. The key of a vertex u ∈ U is min_{v ∈ L(u)} d(v) + c(v, u). In each iteration of Phase I, the vertex u ∈ U with minimal key value d(u) is selected and is deleted from the priority queue by a delete min operation. The vertex u is added to S, and for each vertex v ∈ {u} ∪ L(u), the current edge ce(v) is advanced to the first edge in v's (sorted) adjacency list whose target is in V − S.

For any v ∈ {u} ∪ L(u), let ce(v) = (v, w) be the new current edge of v and denote w's current key by d_w. We add v to L(w), and if d(v) + c(v, w) < d_w, we decrease d_w appropriately. This implies a decrease key operation on the priority queue. By our assumption on the implementation of the priority queue, the cost of an iteration of Phase I is O(log n) plus the number of edges scanned. Phase I ends when |U| becomes n/log n.
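The invariant maintenance described above, advancing a current-edge pointer past targets that are already labeled, can be sketched as follows (helper name and data layout are our own, with adjacency lists stored as weight-sorted (weight, target) pairs):

```python
def advance_current_edge(adj, ce, labeled, v):
    """Move v's current-edge index to the first edge with unlabeled target.

    adj[v]: list of (weight, target) pairs sorted by increasing weight;
    ce[v]: index of v's current edge into adj[v];
    labeled: the set S of labeled vertices.
    """
    while ce[v] < len(adj[v]) and adj[v][ce[v]][1] in labeled:
        ce[v] += 1  # each step scans one more edge of v
```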

Remark: Moffat and Takaoka use a binary heap instead of a Fibonacci heap to realize the priority queue; Fibonacci heaps did not exist at that time. Since a decrease key operation in a binary heap takes logarithmic time, they use a slightly different strategy for Phase I. They keep the vertices in S in the priority queue. The key of vertex v ∈ S is d(v) + c(ce(v)). In each iteration, the vertex of minimum key, say, vertex v ∈ S, is selected from the heap. Let w be the target of the current edge of v. The current edge ce(z) is advanced for all z ∈ {w} ∪ L(w).

Then w is inserted into the priority queue with key d(w) + c(ce(w)), and for each z ∈ L(w), the key of z is increased (since the weight of the new current edge is greater than the weight of the old current edge of z). Moffat and Takaoka show that the expected cost of an increase key operation is constant in a binary heap.

We believe that the implementation using Fibonacci heaps is slightly simpler since it does not use a non-standard operation on priority queues.

What is the total number of edges scanned in Phase I? In [11], Moffat and Takaoka argue as follows: Let U_0 be the set of unlabeled vertices at the end of Phase I. Then |U_0| = n/log n. Since for every vertex v the endpoints of the edges out of v form a random permutation of V, we should expect to scan about log n edges in each adjacency list during Phase I and hence about n log n edges altogether. This argument is incorrect, as U_0 is determined by the orderings of the adjacency lists and cannot be fixed independently. The following example makes this fact obvious. Assume that all edges out of the source have length one and all other edges have length two. Then Phase I scans n − n/log n edges out of the source vertex, and U_0 is determined by the last n/log n edges in the adjacency list of the source. Thus the conclusion that one scans about log n edges in each adjacency list is wrong. However, the derived conclusion that the expected total number of scanned edges is O(n log n) is true, as the following argument shows.

Consider Phase I′, the following modification of Phase I (similar to Spira's algorithm [12]). In this modification, the target of a current edge may be labeled, and the vertices v ∈ S are kept in a priority queue with keys d(v) + c(ce(v)).

When a vertex v with minimum key is selected, let w be the target of ce(v). If w does not belong to S, then w is added to S and to the priority queue. In any case, ce(v) is advanced to the next edge and v's key is increased appropriately.

The modified algorithm finishes Phase I′ when |V − S| = n/log n. Let S_0 be the set of vertices that have been labeled in Phase I′. For every vertex v ∈ S_0, denote by A(v) the set of edges out of v that have been scanned by the modified algorithm and denote by S(v) the targets of the edges in A(v).
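Phase I′ can be sketched as follows. For simplicity the sketch runs to completion (labeling all n vertices), whereas the paper stops Phase I′ once |V − S| = n/log n; the function name and data layout are our own:

```python
import heapq

def spira_sssp(adj, s, n):
    """Single source shortest paths, Spira-style (Phase I' run to the end).

    adj[v]: complete, weight-sorted list of (weight, target) pairs.
    Each labeled vertex v sits in the heap with key d(v) + c(ce(v));
    targets of current edges may already be labeled.
    """
    d = {s: 0}                          # labeled vertices with exact distances
    ce = {s: 0}                         # current-edge index per labeled vertex
    heap = [(adj[s][0][0], s)]          # key = d(s) + weight of first edge
    while len(d) < n and heap:
        key, v = heapq.heappop(heap)    # minimum key over labeled vertices
        w = adj[v][ce[v]][1]            # target of v's current edge
        if w not in d:                  # w unlabeled: key is exactly d(w)
            d[w] = key
            ce[w] = 0
            heapq.heappush(heap, (key + adj[w][0][0], w))
        ce[v] += 1                      # advance ce(v) to the next edge
        if ce[v] < len(adj[v]):         # re-insert v with its increased key
            heapq.heappush(heap, (d[v] + adj[v][ce[v]][0], v))
    return d
```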

The analysis of the coupon collector's problem implies that E[∑_v |A(v)|] = O(n log n). Indeed, let X_i be the number of edges scanned when |S| = i. Since the targets of the edges are random, we have E[X_i] ≤ n/(n − i) and hence E[∑_{i ≤ n − n/log n} X_i] ≤ n ∑_{i ≤ n} 1/i ≤ n (ln n + 1).

This proves that E[∑_v |A(v)|] = O(n log n).
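The harmonic-sum bound ∑_{i < n} n/(n − i) = n · H_{n−1} ≤ n (ln n + 1) behind this estimate is easy to check numerically (an illustrative sketch, not part of the proof):

```python
import math

def expected_scans_bound(n):
    # sum_{i=1}^{n-1} n/(n-i) = n * H_{n-1}: coupon-collector expectation
    return sum(n / (n - i) for i in range(1, n))

n = 1024
assert expected_scans_bound(n) <= n * (math.log(n) + 1)
```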

It turns out that Phase I of the algorithm by Moffat and Takaoka shows basically the same behavior. In fact, at the end of Phase I, their algorithm has labeled exactly the vertices in S_0, and all the edges in ∪_v A(v) have been scanned by the Moffat and Takaoka algorithm as well. However, for the purpose of maintaining the invariant, the current edge pointer of each vertex v ∈ S_0 has been advanced to the first vertex in U_0 in v's adjacency list. For every vertex v ∈ S_0, let e_v and e'_v be the current edge of v at the end of Phase I in the algorithm by Moffat and Takaoka and at the end of Phase I′ in the algorithm by Spira, respectively. e'_v precedes e_v in v's adjacency list, and the edge e_v is the first edge after e'_v with target in U_0. We can imagine scanning the edges after e'_v only when Phase I′ has terminated. Due to the endpoint-independent distribution of edge weights, the targets of the edges after e'_v in the adjacency list form a random permutation of V − S(v) ⊇ V − S_0 = U_0. This is the setting of the probabilistic experiment in Sect. 2.1, where the number of edges between e'_v and e_v corresponds to Y_1. Since |V − S(v)| ≤ n and m := |U_0| = n/log n, we deduce from (2) in Sect. 2.1 that the expected number of edges between e'_v and e_v is O(log n). This completes the analysis of Phase I.

Moreover, whenever the current edge of a vertex is advanced, it is advanced to the next edge having its endpoint in U_0. As argued above, for any vertex v ∈ S, the vertices in U_0 are distributed randomly in V − S(v), and (2) in Sect. 2.1 shows that whenever ce(v) is advanced in Phase II, it is advanced by an expected number of O(log n) edges. We conclude that the expected cost of one iteration in Phase II is O(log n) and, given that we do k iterations in Phase II, the expected cost of Phase II is O(k log n). Hence, the total expected cost of Phase II is O(n log n).

The above discussion is summarized in the following theorem.

**Theorem 1. For endpoint-independent distributions, the algorithm of Moffat and Takaoka runs in expected time O(n^2 log n).**

We will next prove that the algorithm by Moffat and Takaoka is reliable, i.e., that, with high probability, its running time does not exceed its expectation by more than a constant multiplicative factor.

**Theorem 2. The running time of the all-pairs shortest path algorithm by Moffat and Takaoka is O(n^2 log n) with high probability.**