#11. Shortest Path

  • Topo sort + relax: O(E + V) time. Only work for DAG
  • Dijkstra, work for graph without negative weight. 可以有环.
  • Bellman-Ford, work for graph without negative cycle, 可以有negative weight, 也可以有环
  • Floyd-Warshall, all sources shortest path algorithm
Algorithm Condition Time Space
Topological Sort + Relaxing DAG O(V) O(V)
Dijkstra Without Negative Edge O((E + V)log(V))
Bellman-Ford Without Negative Cycle O(EV)
Floyd-Warshall All pairs shortest path O(V^3 )

#11.1 Topological Sort to Find Shortest Path for Directed Acyclic Graph (DAG)

Only work for weighed DAG, either positive or negative weight

#Algorithm of Acyclic Shortest Path

  1. Topological Sort
  2. Relaxing based on topological order

#Performance

1
Time: O(E + V)

#Proof

因为是Acyclic, 而且是topological order, 所以each edge is relaxed exactly once.

  1. distTo[w]不可能increase, 因为我们只有当distTo[v] + weight(v, w) < dist[w]的时候才更新dist[w]
  2. distTo[v]不可能再有所改变. 因为是topological order, 当处理到w的时候可以保证已经没有任何未处理的edge再指向v.

Java Acyclic Shortest Path Problem

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
public class AcyclicShortestPath {
public static Map<Integer, Integer> acyclicShortestPath(final Map<Integer, Map<Integer, Integer>> graph, final int source) {
if (graph == null) {
throw new IllegalArgumentException();
}
final Map<Integer, Integer> dist = new HashMap<>();
for (final int v : graph.keySet()) {
dist.put(v, Integer.MAX_VALUE);
}
dist.put(v, 0);
final List<Integer> topologicalOrder = TopologicalSort.topologicalSortDFS(graph);
for (final int v : topologicalOrder) {
for (final int adj : graph.get(v)) {
relax(adj, v, graph.get(v).get(adj), dist);
}
}
return dist;
}
private static void relax(final int v, final int parent, final int weight, Map<Integer, Integer> dist) {
final int updatedWeight = dist.get(parent) + weight;
if (updatedWeight < dist.get(v)) {
dist.put(v, updatedWeight);
}
}
}

Python Acyclic Shortest Path Problem

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
#!/usr/bin/python
import collections
import toposort
# Problem: SSSP on a Weighted DAG
# Solution: toplogical sort + relax
# This algorithm assumes:
# 1. the input graph is a DAG
# 2. the input graph has the start and end node.
# 3. every node in the graph is reachable from node
# 4. there is only one node with 0 indegree, which is the starting node
INT_MAX = 2147483647
def shortest_path(graph, start, end):
"""
:type graph: Dict[int, Dict[int, int]]
:type start: int
:type end: int
:rtype: void
"""
if not graph:
raise Exception("invalid graph")
if start not in graph or end not in graph:
raise Exception("invalid input")
ordered = toposort.topo_sort_dfs(graph)
dist, path = collections.defaultdict(lambda: INT_MAX), collections.defaultdict(lambda: INT_MAX)
dist[start] = 0
dfs(graph, start, end, dist, path)
print("shortest length is: %d" % dist[end])
print("shortest path is: %s" % get_path(path, start, end))
def dfs(graph, v, end, dist, path):
if v == end:
return
for adj in graph[v].keys():
if dist[v] + graph[v][adj] < dist[adj]:
dist[adj] = dist[v] + graph[v][adj]
path[adj] = v
dfs(graph, adj, end, dist, path)
def get_path(path, start, end):
res = ''
v = end
while v != start:
res = v + ", " + res
v = path[v]
return res

#11.2 Longest Path Problem

  • For general longest path problem, it’s an NP-Hard problem
  • For DAG, use topological sort to find the longest path

#Algorithm

  1. Negate all weights. (This solution allows negative weights)
  2. Run AcyclicShortestPath Algorithm.
  3. Negate back the results.

#11.2.1 Parallel Job Scheduling Problem

Given a set of jobs with durations and precedence constraints, schedule the jobs by finding the start time for each job, so as to achieve the minimum completion time, while respecting the constraints.

This is a longest path problem

Solution:

  1. Create an edge-weighted DAG.
  2. Two virtual vertices representing:
    • Source: “Before everything start”
    • Destination: “After everything is done”
  3. Each job will create 2 vertices
    • Job start
    • Job end
    • The weight is the duration
    • There will be at least 3 edges for each job (not vertex)
      • Job start –> Job end
      • Source –> Job start
      • Job end –> Destination
  4. Each constraint is an edge.

The longest path of this problem is also called a “critical path”

证明:

  1. Job和Job之间的先后顺序只会产生一条edge, 但不会有weight. 也就是weight是0.
  2. 有weight的都是每一个Job的Duration.

所以比如下图这种情况:

  • 要保证每个Job都完成的话,只能取longest path而不是shortest path.

#11.3 Dijkstra Algorithm

Only work for non-negative weighed directed graph (可以有环).

Dijkstra有两种implementation:

  1. Lazy Implementation
    • not good for dense graph
    • lots of duplicated tuples
  2. Eager Implementation with Indexed Heap

[Notes]

Dijkstra中如果一个node被visit过了, 表示这个node到source的最短距离已经被找到了, 不能improve了, 这也是为什么Dijkstra不能有negative weight的原因.

#Dijkstra vs Eager Prim

They are essentially the same algorithm.

  • Prim’s algorithm picks the closest vertex to the Tree.
  • Dijkstra algorithm picks the closest vertex to the Source.

Both of them are trying to find the spanning tree. (Dijkstra is to find the shortest path from a source to every other vertices.)

  • Prim’s algorithm is for the undirected graph.
  • Dijkstra algorithm is for the directed graph (with no negative weights).

#Performance

1
Time: O(E * log(V)) for IndexedHeap

Java Dijkstra Eager Implementation

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
public class Dijkastra {
public static Map<Integer, Integer> shortestPath(final Map<Integer, Map<Integer, Integer>> graph, final int source) {
if (graph == null || !graph.containsKey(source)) {
throw new IllegalArgumentException();
}
/* Better to verify if the graph only contains non-negative weight */
final IndexHeap<Integer> minIndexHeap = new IndexHeap<>((x, y) -> x.weight - y.weight);
final Map<Integer, Integer> dist = new HashMap<>();
final Map<Integer, Integer> parent = new HashMap<>();
for (final int v : graph.keySet()) {
dist.put(v, Integer.MAX_VALUE);
}
minIndexHeap.add(source, 0);
while (!minIndexHeap.isEmpty()) {
final int v = minIndexHeap.poll();
for (final int adj : graph.get(v)) {
relax(adj, v, graph.get(v).get(adj), minHeap, dist, parent);
}
}
return dist;
}
private void relax(final int v, final int parent, final int weight,
final IndexHeap<Integer> minIndexHeap, final Map<Integer, Integer> dist, Map<Integer, Integer> parent) {
final int updatedWeight = dist.get(parent) + weight;
if (updatedWeight < dist.get(v)) {
dist.put(v, updatedWeight);
parent.put(v, parent);
if (minIndexHeap.containsKey(v)) {
minIndexHeap.decreaseKey(v, updatedWeight);
} else {
minIndexHeap.add(v, updatedWeight);
}
}
}
}

Python Dijkstra Lazy Implementation

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
import heapq
# Problem: SSSP on a weighted directed graph without negative edge
# The algorithm assumes:
# 1. the graph does not have negative edge
# 2. the graph has start and end node
# This is a lazy implementation without updating priority in the heap. The down side of this lazy implementation is
# lots of duplicated tuples in the heap for dense graph.
# There is another eager implementation of dijkstra, which uses an IndexHeap to update the priorities inside the heap.
INT_MAX = 2147483647
def dijkstra(graph, start, end):
"""
:type graph: Dict[int, Dict[int, int]]
:type start: int
:type end: int
:rtype: int, shortest path from start to end
"""
if not graph:
raise Exception("invalid graph")
visited = set()
dist, path = {}
hp = []
# Initialize dist and path
for v in graph.keys():
for w in graph[v].keys():
if v not in dist:
dist[v] = INT_MAX
path[v] = v
if w not in dist:
dist[w] = INT_MAX
path[w] = w
heapq.heappush(hp, (0, start))
dist[start] = 0
while heapq:
curr = heapq.heappop(hp)
distance, v = curr[0], curr[1]
visited.add(v)
if v == end:
return distance
if dist[v] < distance:
continue
for adj in graph[v].keys():
if adj not in visited:
new_dist = distance + graph[v][adj]
if new_dist < dist[adj]:
dist[adj] = new_dist
path[adj] = v
heapq.heappush(hp, (new_dist, adj))
raise Exception("invalid graph")

#11.4 Bellman-Ford Algorithm

Work for Negative Weights on directed graph without negative cycle

#11.4.1 Why Dijkstra doesn’t work for negative edge?

  • 这个slides里的内容很有意思. 先是举例子说明了为什么Dijkstra Algorithm不能在Negative weighted graph上work.
  • 然后可能有人会想出上面Re-weighting的trick. 但是这样的想法是错误的,因为即使这样得到了一个positive weighted graph, 但是shortest path就不是原来那条了!

  • Reference from Princeton’s Algorithm course.

#11.4.2 Algorithm

Relax all edges (in any order) V - 1 times

Bellman-Ford is a Dynamic Programming Algorithm

#11.4.3 Proof: Why V - 1 ?

Check here to see why Bellman-ford algorithm works.

Another proof from Tushar Roy

Relaxing edges的order很大程度上决定了Bellman-Ford algorithm找到shortest path的快慢.

比如Tushar的视频里举了一个例子:

假设有这么一个简单的graph, source是0. 如果按照左边的edge顺序来进行relax, 很显然第一遍iteration只能更新V2的distance. 第二遍iteration只能更新V3的distance. 最后一遍iteration才能找到V4的shortest path.

但是如果edges的relax顺序换成右边的, 那么在第一轮iteration的时候就能找出所有vertices的shortest distance.

所以在worst case的情况下, 需要进行V - 1次relaxing.

#11.4.4 Performance

1
2
3
4
5
6
1. Single Source Shortest Path:
* O(V * E) time
* O(V) space
2. All Sources Shortest Path:
* Apply Bellman-Ford V times. So O(V^2 * E) times.

#11.4.5 Bellman-Ford For All Sources Shortest Path Problem

We can apply Bellman-Ford V times to get all sources shortest distances. (Make every vertex as the single source)

So the time complexity is O(V^2 * E). But What’s the maximum number of edges with V vertices ?

  • 对于每个vertex, 它最多能有V - 1条边, 那么对于整个graph, 最多能有V * (V - 1)条边.
  • 因此Bellman-Ford algorithm在All sources shortest path problem中worst case的时间复杂度是O( V^4 ).
  • Floyd-Warshall algorithm在这个All sources shortest path problem中只需要O(V^3 )

#11.4.6 How to find Negative Cycle

A Shortest path exists iff no negative cycle

  • 因为如果有negative cycle的话, shortest path可以沿着这个cycle无限循环下去, 因为是negative的关系,每绕一圈total weight就更小一点.

Algorithm

上面已经证明了worst case情况下需要进行 V - 1 次relaxing能找到SSSP. 那么我们只需要再多进行一次relax, 如果有vertex的shortest distance有变化(变得更小, 因为relax只有更小才更新), 那么就说明有negative cycle.

Java Bellman-Ford

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
public class BellmanFord {
public void Map<Integer, Integer> shortestPath(Map<Integer, Map<Integer, Integer>> graph, final int source) {
if (graph == null) {
throw new IllegalArgumentException();
}
final Map<Integer, Integer> dist = new HashMap<>();
final Map<Integer, Integer> parent = new HashMap<>();
for (final int v : graph.keySet()) {
dist.put(v, Integer.MAX_VALUE);
}
dist.put(source, 0);
for (int i = 0; i < graph.size(); i++) {
relax(graph, dist, parent);
}
// One more relax to detect negative cycle
if (hasNegativeCycle(graph, dist)) {
throw new IllegalArgumentException();
}
return dist;
}
private void relax(final Map<Integer, Map<Integer, Integer>> graph, final Map<Integer, Integer> dist, final Map<Integer, Integer> parent) {
for (final int v : graph.keySet()) {
for (final int adj : graph.get(v)) {
final int updatedWeight = dist.get(v) + graph.get(v).get(adj);
if (updatedWeight < dist[adj]) {
dist.put(adj, updatedWeight);
parent.put(adj, v);
}
}
}
}
private boolean hasNegativeCycle(final Map<Integer, Map<Integer, Integer>> graph, final Map<Integer, Integer> dist) {
for (final int v : graph.keySet()) {
for (final int adj : graph.get(v)) {
final int updatedWeight = dist.get(v) + graph.get(v).get(adj);
if (updatedWeight < dist[adj]) {
return true;
}
}
}
return false;
}
}

Python Bellman-Ford

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
# Problem: SSSP on a weighted directed graph,
# raise exception if negative edge exists
# This algorithm assumes:
# 1. the graph could be any directed graph, with or without negative edges or cycles.
# 2. the graph has start and end node
#
# Why this algorithm works?
# https://www.geeksforgeeks.org/bellman-ford-algorithm-dp-23/
INT_MAX = 2147483647
def bellmanford(graph, start, end):
"""
:type graph: Dict[int, Dict[int, int]]
:type start: int
:type end: int
:rtype: int
"""
if not graph:
raise Exception("invalid graph")
dist = {}
vertices = set()
for v in graph.keys():
for w in graph[v].keys():
vertices.add(v)
vertices.add(w)
dist[v] = INT_MAX
dist[w] = INT_MAX
dist[start] = 0
n = len(vertices)
for _ in range(n - 1):
for v in graph.keys():
for w in graph.keys():
dist[w] = min(dist[w], dist[v] + graph[v][w])
for v in graph.keys():
for w in graph[v].keys():
if dist[v] + graph[v][w] < dist[w]:
raise Exception("negative cycle")
return dist[end]

#11.5 Floyd-Warshall Algorithm

Python Floyd-Warshall Algorithm

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
# Problem: All pairs shortest path problem
# Solution:
# O(V^3) time Dynamic Programming problem
#
# dp[k][i][j] represents the shortest distance from i to j routing through nodes 0 to k
# dp[k][i][j] = min(dp[k-1][i][j], dp[k-1][i][k] + dp[k-1][k][j])
#
# space is O(V^3) but can be optimized to O(V^2)
# dp[i][j] = min(dp[i][j], dp[i][k]+dp[k][j])
import math
def floyd_warshall(matrix):
"""
:type matrix: List[List[int]], adjacency matrix
If there is no edge between two nodes, the corresponding entry in the matrix is math.inf
:rtype: List[List[int]], all pairs shortest path
"""
if not matrix:
raise Exception("invalid graph")
n = len(matrix)
dp = [[math.inf for j in range(n)] for i in range(n)]
# next[i][j] represents the next node from i if we want to go to j with shortest path
next = [[None for j in range(n)] for i in range(n)]
# Initialization
for i in range(n):
for j in range(n):
dp[i][j] = matrix[i][j]
if matrix[i][j] != math.inf:
next[i][j] = j
# Floyd-Warshall DP
for k in range(n):
for i in range(n):
for j in range(n):
if dp[i][k] + dp[k][j] < dp[i][j]:
dp[i][j] = dp[i][k] + dp[k][j]
next[i][j] = next[i][k]
# Detect Negative Cycle, do DP one more time
for k in range(n):
for i in range(n):
for j in range(n):
if dp[i][k] + dp[k][j] < dp[i][j]:
dp[i][j] = -math.inf
next[i][j] = -1
return dp[n - 1]
def get_path(dp, next, start, end):
"""
:type dp: List[List[int]]
:type next: List[List[int]]
:type start: int
:type end: int
:rtype: List[int]
"""
if dp[start][end] == math.inf:
raise Exception("no path between %s and %s" % (start, end))
elif dp[start][end] == -math.inf:
raise Exception("negative cycle between %s and %s" % (start, end))
path = []
curr = start
while curr != end:
if curr == -1:
raise Exception("negative cycle between %s and %s" % (start, end))
path.append(curr)
curr = next[curr][end]
return path

Comments

01/03/2020