#Graph

总结一下Graph的所有算法

#1. Search and Traversal

Traversal的算法对于Directed和Undirected graph都通用.

Traversal算法的基本原则是maintain一个visited的HashSet. 当前处理到的结点在处理adjacency之前就会被加入visited. 而对于adjacency, 也只有没有被访问过的adj才会继续递归或者加入queue.

对于undirected graph而言, 从v - w之后, 虽然w的adjacency中包含v, 但很显然v已经被访问过了.

#Depth First Search (DFS)

Java

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
public class DFS {
public static void traversal(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
final Set<Integer> visited = new HashSet<>();
final List<Integer> res = new ArrayList<>();
for (final int v : graph.keySet()) {
if (!visited.contains((v))) {
dfs(v, graph, visited, res);
}
}
}
private static void dfs(final int v, final Map<Integer, Set<Integer>> graph, final Set<Integer> visited, final List<Integer> res) {
visited.add(v);
res.add(v);
for (final int adj : graph.get(v)) {
if (!visited.contains(adj)) {
dfs(adj, graph, visited, res);
}
}
}
}

#Breadth First Search (BFS)

Java

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
public class BFS {
public static List<Integer> traversal(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
final Queue<Integer> queue = new LinkedList<>();
final Set<Integer> visited = new HashSet<>();
final List<Integer> res = new ArrayList<>();
for (final int v : graph.keySet()) {
if (!visited.contains(v)) {
queue.offer(v);
bfs(graph, queue, visited, res);
}
}
return res;
}
private static void bfs(final Map<Integer, Set<Integer>> graph, final Queue<Integer> queue,
final Set<Integer> visited, final List<Integer> res) {
while (!queue.isEmpty()) {
final int v = queue.poll();
res.add(v);
visited.add(v);
for (final int adj : graph.get(v)) {
if (!visited.contains(adj)) {
queue.offer(adj);
}
}
}
}
}

#2. Cycle

#How to detect cycle?

Detect Cycle中Directed graph和Undirected graph算法的区别相对复杂:

主要的问题是在Undirected graph中对于edge v - w, 当由v到w之后, w需要知道自己是由v过来. 否则w会发现v被访问过但是还没有结束,以至于误以为有cycle.

DFS中处理这个问题非常简单, 将v作为一个parent参数传给v的每个adj.

BFS比较复杂:

  • 对于Directed Graph, 两种BFS方法:
    1. 可以用topological sort的方法做.
    2. 也可以记录每个结点的访问情况, 发现重复被访问的结点既是有cycle. (不需要维护parent)
  • 对于Undirected Graph, 只能用记录访问情况的方法.

[注意两点]

  1. BFS在处理Undirected Graph的过程中记录访问情况的方法需要维护一个parent的HashMap以便于让w知道它是由经过v过来的.
  2. BFS中记录访问情况的方法不同DFS, 不需要访问过但还没有结束,只要访问过即说明有cycle.

虽然在一般情况下比较常用DFS来detect cycle. 但是鉴于BFS复杂的情况, 一定要搞清楚用BFS找cycle在各种情况下的不同!

#(DFS) Depth First Search to Detect Cycle

利用递归. 不同于普通遍历, 判断是否有cycle的时候要利用一个HashMap来记录每个结点的状态.

0 - 未访问
1 - 访问结束
-1 - 访问过但还未结束

#Directed Graph
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
public static void hasCycle(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
Map<Integer, Integer> visited = new HashMap<>();
for (final int v : graph.keySet()) {
if (!visited.containsKey(v)) {
if (dfsHasCycle(v, graph, visited)) {
return true;
}
}
}
return false;
}
private static boolean dfsHasCycle(final int v, final Map<Integer, Set<Integer>> graph,
final Map<Integer, Integer> visited) {
final isVisited = visited.getOrDefault(v, 0);
if (isVisited == 1) return false;
if (isVisited == -1) return true;
visited.put(v, -1);
for (final int adj : graph.keySet()) {
if (dfsHasCycle(adj, graph, visited)) {
return true;
}
}
visited.put(v, 1);
return false;
}
#Undirected Graph
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
public static void hasCycle(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
Map<Integer, Integer> visited = new HashMap<>();
for (final int v : graph.keySet()) {
if (!visited.containsKey(v)) {
if (dfsHasCycle(v, graph, visited, v)) {
return true;
}
}
}
return false;
}
private static boolean dfsHasCycle(final int v, final Map<Integer, Set<Integer>> graph,
final Map<Integer, Integer> visited, final int parent) {
final isVisited = visited.getOrDefault(v, 0);
if (isVisited == 1) return false;
if (isVisited == -1) return true;
visited.put(v, -1);
for (final int adj : graph.keySet()) {
if (adj != parent) { // Only difference with Directed graph version.
if (dfsHasCycle(adj, graph, visited, v)) {
return true;
}
}
}
visited.put(v, 1);
return false;
}
#(BFS) Breadth First Search to Detect Cycle
#Directed Graph

Solution #1: 记录访问情况

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
public static boolean hasCycle(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
// 不再像DFS那样需要-1, 0, 1三种状态. BFS只需要0和1两种,所以用一个HashSet就够了.
final Set<Integer> visited = new HashSet<>();
final Queue<Integer> queue = new LinkedList<>();
for (final int v : graph.keySet()) {
queue.offer(v);
while (!queue.isEmpty()) {
final int curr = queue.poll();
if (visited.contains(curr)) return true;
visited.add(curr);
for (final int adj : graph.get(curr)) {
if (visited.contains(curr)) return true;
queue.offer(adj);
}
}
}
return false;
}

Solution #2: Topological Sort

BFS判断是否有cycle的思路是利用topological sort. 如果topological sort走完之后还有vertices的indegree大于0的话. 就说明是有环.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
public static boolean hasCycle(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
// Compute the indegree of all vertices
final Map<Integer, Integer> indgree = new HashMap<>();
for (final int v : graph.keySet()) {
indegree.putIfAbsent(v, 0);
for (final int adj : graph.get(v)) {
indegree.putIfAbsent(adj, 0);
indegree.put(adj, indegree.get(adj) + 1);
}
}
// Enqueue every vertex with 0 indegree
final Queue<Integer> queue = new LinkedList<>();
for (final int v : indegree) {
if (indegree.get(v) == 0) {
queue.offer(v);
}
}
// Topological sort
while (!queue.isEmpyt()) {
final int v = queue.poll();
for (final int adj : graph.get(v)) {
indegree.put(adj, indegree.get(adj) - 1);
if (indegree.get(adj) == 0) {
queue.offer(adj);
}
}
}
// Find out if any vertex still with non-zero indegree
for (final int v : indegree.keySet()) {
if (indegree.get(v) != 0) {
return true;
}
}
return false;
}
#Undirected Graph

Solution: 记录访问情况

需要有个parent的HashMap来记录每个结点是经过谁过来的.

parent只需要是一个link, Map<Integer, Integer>即可, 不需要是一个Map<Integer, Set<Integer>>. 因为即使有多个结点v1, v2指向w, 因为是undirected graph的关系, 完全可以看做是v1指向w, w再指向v2.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
public static boolean hasCycle(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
// 不再像DFS那样需要-1, 0, 1三种状态. BFS只需要0和1两种,所以用一个HashSet就够了.
final Set<Integer> visited = new HashSet<>();
final Queue<Integer> queue = new LinkedList<>();
final Map<Integer, Integer> parent = new HashMap<>();
for (final int v : graph.keySet()) {
queue.offer(v);
while (!queue.isEmpty()) {
final int curr = queue.poll();
if (visited.contains(curr)) return true;
visited.add(curr);
for (final int adj : graph.get(curr)) {
if (parent.get(curr) == adj) continue;
if (visited.contains(curr)) return true;
// Update parent link
parent.put(adj, curr);
queue.offer(adj);
}
}
}
return false;
}

#3. Topological Sort

Topological sort的概念只针对DAG! 有环的有向图也不行!

#(DFS) Depth First Search For Topological Sort

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
public static List<Integer> topologicalSortDFS(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
final List<Integer> res = new LinkedList<>();
final Map<Integer, Integer> visited = new HashMap<>();
for (final int v : graph.keySet()) {
if (!visited.contains(v)) {
dfs(v, graph, visited, res);
}
}
return res;
}
private static void dfs(final int v, final Map<Integer, Set<Integer>> graph, final Map<Integer, Integer> visited, final List<Integer> res) {
visited.putIfAbsent(v, 0);
final int status = visited.get(v);
if (status == 1) return;
if (status == -1) throw IllegalArgumentException("The given graph is not a DAG");
visited.put(v, -1);
for (final int adj : graph.get(v)) {
dfs(adj, graph, visited, res);
}
visited.put(v, 1);
res.add(0, v);
}

#(BFS) Breadth First Search for Topological Sort

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
public static List<Integer> topologicalSortBFS(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
final Map<Integer, Integer> indegrees = new HashMap<>();
for (final int v : graph.keySet()) {
indegrees.putIfAbsent(v, 0);
for (final int adj : graph.get(v)) {
indegrees.putIfAbsent(adj, 0);
indegrees.put(adj, indegrees.get(adj) + 1);
}
}
final Queue<Integer> queue = new LinkedList<>();
for (final int v : graph.keySet()) {
if (indegrees.get(v) == 0) {
queue.offer(v);
}
}
while (!queue.isEmpty()) {
final int front = queue.poll();
res.add(front);
for (final int adj : graph.keySet()) {
indegrees.put(adj, indegreees.get(adj) - 1);
if (indegrees.get(adj) == 0) {
queue.offer(adj);
}
}
}
return res;
}

#4. Connected Component

#Strongly Connected Component

  1. What is connected graph ?

    • A graph is connected if there is a path between each pair of vertices.
  2. What is strongly connected graph ?

    • A directed graph is strongly connected if there is a path between each pair of vertices.
    • Strongly connected的概念只存在于Directed Graph

Strongly connected components in G are the same as they are in reversed G!

#Kosaraju’s Algorithm
  1. Topological sort on reversed G!
  2. Run DFS based on the order from step 1
#Simple Proof:
#Kernel DAG

Kernel DAG是将一个graph里的每个strongly connected component变成一个vertex, 然后把所有这些SCCs给连起来. 得到的结果一定是一个DAG. 比如Tushar视频里的这个图:


Reference: Strongly Connected Components Kosaraju’s Algorithm Graph Algorithm from Tushar Roy

4个SCCs连起来的一定是一个DAG. 假设有一条边从(DEF)到(ABC),那么很容易证明(ABCDEF)是一个SCC而不是两个.

#Why Kernal DAG ?

Kernel DAG解释了为什么我们要先把G到倒过来求topological order.

用上图的例子来看, 这个order保证了在第二遍找SCCs的DFS的时候我们会从(DEF)中任何一个结点或者(K)来开始DFS, 假设从(DEF)中任何一个结点开始, 根据上面的图很显然我们没有办法走到ABC或者GHIJ. 假设从K开始也很显然没有办法走到GHIJ.

这样就能找到(DEF)和(K)两个strongly connected component. 接下来无论从(ABC)还是(GHIJ)中任意一个结点开始都能顺利找到自己的strongly connected component.

这样就证明了这个算法的可行性.

#Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
public class StronglyConnectedComponent {
public static List<Set<Integer>> scc(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
final List<Set<Integer>> res = new ArrayList<>();
final Map<Integer, Set<Integer>> reversedGraph = reverseGraph(graph);
final List<Integer> sorted = TopologicalSort.topoDFS(graph);
final Set<Integer> visited = new HashSet<>();
for (final int v : sorted) {
if (!visited.contains(v)) {
final Set<Integer> currentComponent = new HashSet<>();
getSCCByDFS(v, graph, visited, currentComponent);
res.add(new HashSet<>(currentComponent));
}
}
return res;
}
private static Map<Integer, Set<Integer>> reverseGraph(final Map<Integer, Set<Integer>> graph) {
final Map<Integer, Set<Integer>> reversedGraph = new HashMap<>();
for (final int v : graph.keySet()) {
reversedGraph.putIfAbsent(v, new HashSet<>());
for (final int adj : graph.get(v)) {
reversedGraph.putIfAbsent(adj, new HashSet<>());
reversedGraph.get(adj).add(v);
}
}
return reversedGraph;
}
private static void getSCCByDFS(final int v, final Map<Integer, Set<Integer>> graph, final Set<Integer> visited, final Set<Integer> scc) {
scc.add(v);
visited.add(v);
for (final int adj : graph.keySet()) {
if (!visited.contains(adj)) {
scc.add(adj);
visited.add(adj);
getSCCByDFS(adj, graph, visited, scc);
}
}
}
}

#Connected Component in Undirected Graph

判断Undirected Graph是否connect比较简单, 所以随便找一个结点开始dfs, 如果所有其他结点都能被reach, 那么这个graph就是connected的. 因为如果 v - w 的话, v能到w, w也能到v.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
public ConnectedComponent {
public static boolean isConnected(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
final Set<Integer> visited = new HashSet<>();
for (final int v : graph.keySet()) {
dfs(v, graph, visited);
break;
}
for (final int v : graph.keySet()) {
if (!visited.contains(v)) {
return false;
}
}
return true;
}
}

#5. Euler Cycle

欧拉回路: A Cycle with each edge exactly once.

#Directed Graph

A directed graph has an Euler Cycle iff:

  1. Every vertex with non-zero degree belong to a single strongly connected component
  2. Every vertex must have the same indegree and outdegree
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
public static boolean hasEulerCycle(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
return isStronglyConnected(graph) && hasSameDegree(graph);
}
private static boolean isStronglyConnected(final Map<Integer, Set<Integer>> graph) {
final List<Set<Integer>> scc = StronglyConnectedComponent.scc(graph);
return scc.size() == 1;
}
private static boolean hasSameDegree(final Map<Integer, Set<Integer>> graph) {
final Map<Integer, Integer> degrees = new HashMap<>();
for (final int v : graph.keySet()) {
degrees.putIfAbsent(v, 0);
for (final int adj : graph.get(v)) {
degrees.putIfAbsent(adj, 0);
degrees.put(v, degrees.get(v) - 1);
degrees.put(adj, degrees.get(v) + 1);
}
}
for (final int v : graph.keySet()) {
if (degrees.get(v) != 0) {
return false;
}
}
return true;
}
#Hierholzer’s Algorithm

Find Euler path/circuit in Directed graph

特别注意这个implementation. 当处理Euler Path/Cycle的问题时,如果需要自己构建graph, 不需要用Inner class来构建Edge, 然后再用类似于Set的数据结构来记录edge的访问情况.

仔细理解一下下面的code, 用一个Queue来构建adjacency list会省了很多事.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
public static List<Integer> findEulerPath(final Map<Integer, Queue<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
final List<Integer> res = new LinkedList<>();
// if there is no cycle, src must have 0 indegree.
final int src = computeIndegreeAndGetSource(graph);
if (src == -1) {
// Euler Cycle, any vertex can be a source, since it's a cycle.
for (final int v : graph.keySet()) {
dfs(v, graph, res);
break;
}
} else {
// Euler Path
dfs(src, graph, res);
}
return res;
}
private void dfs(final int v, final Map<Integer, Queue<Integer>> graph, final List<Integer> res) {
while (!graph.get(v).isEmpty()) {
final int adj = graph.get(v).poll();
dfs(adj, graph, res);
}
res.add(0, v);
}

#Undirected Graph

An undirected graph has an Euler Cycle iff:

  1. Every vertex with non-zero degree belong to a single connected component
  2. Every vertex must have even number as its degree

Implementation中注意区分Directed Graph里的isStronglyConnected和Undirected Graph里的isConnected. Strongly Connected Component的概念只有在Directed Graph里才会出现!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
public static boolean hasEulerCycle(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
return ConnectedComponent.isConnected(graph) && getNumberOfVerticesWithOddDegree(graph) == 0;
}
private static int getNumberOfVerticesWithOddDegree(final Map<Integer, Set<Integer>> graph) {
int count = 0;
for (final int v : graph.keySet()) {
if (graph.get(v).size() % 2 != 0) {
count++;
}
}
return count;
}

An undirected graph has an Euler Path iff:

  1. Every vertex with non-zero degree belong to a single strongly connected component
  2. Only 0 or 2 vertices are allowed to have odd number as its degree, all other vertices must have even number as their degrees.

找path和找cycle几乎一模一样, 唯一的区别就是判断奇数degree的结点个数是不是0或者2

1
2
3
4
5
6
7
8
public static boolean hasEulerPath(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
final int oddDegrees getNumberOfVerticesWithOddDegree(graph);
return ConnectedComponent.isConnected(graph) && (oddDegrees == 0 || oddDegrees == 2);
}

#6. Hamilton Cycle

Hamilton Cycle: A cycle with each vertex exactly once.

Hamilton Cycle是一个Backtracking问题!

#Directed Graph

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
public static boolean hasHamiltonCycle(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
final Set<Integer> visited = new HashSet<>();
final List<Integer> res = new ArrayList<>();
for (final int v : graph.keySet()) {
visited.add(v);
res.add(v);
if (backtracking(v, graph, visited, res)) {
return true;
}
res.remove(0);
visited.remove(v);
}
return false;
}
private static boolean backtracking(final int v, final Map<Integer, Set<Integer>> graph, final Set<Integer> visited, final List<Integer> res) {
if (res.size() == graph.size()) {
// If the last node in res list is connected to the first node in the res list
final int lastNode = res.get(res.size() - 1);
final int firstNode = res.get(0);
if (graph.get(lastNode).contains(firstNode)) {
return true;
} else {
return false;
}
}
for (final int adj : graph.get(v)) {
if (!visited.contains(adj)) {
visited.add(adj);
res.add(adj);
if (backtracking(adj, graph, visited, res)) {
return true;
}
res.remove(res.size() - 1);
visited.remove(adj);
}
}
return false;
}

#Undirected Graph

Hamilton Cycle对于Undirected Graph依旧是一个Backtracking的problem

和Directed Graph唯一的区别就是要pass一个parent避免无限循环.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
public static boolean hasHamiltonCycle(final Map<Integer, Set<Integer>> graph) {
if (graph == null) {
return new IllegalArgumentException();
}
final Set<Integer> visited = new HashSet<>();
final List<Integer> res = new ArrayList<>();
for (final int v : graph.keySet()) {
visited.add(v);
res.add(v);
if (backtracking(v, graph, visited, res, v)) {
return true;
}
res.remove(0);
visited.remove(v);
}
return false;
}
private static void backtracking(final int v, final Map<Integer, Set<Integer>> graph, final Set<Integer> visited, final List<Integer> res, final int parent) {
if (res.size() == graph.size()) {
final lastNode = res.get(res.size() - 1);
final firstNode = res.get(0);
if (graph.get(lastNode).contains(firstNode)) {
return true;
} else {
return false;
}
}
for (final int adj : graph.get(v)) {
if (adj != parent && !visited.contains(adj)) {
visited.add(adj);
res.add(adj);
if (backtracking(adj, graph, visited, res, v)) {
return true;
}
res.remove(res.size() - 1);
res.remove(adj);
}
}
return false;
}

#7. Minimum Spanning Tree

Given: Undirected Graph G with positive edge weights (connected)

Definition: A spanning tree of G is a subgraph that is both a tree (connected and acyclic) and spanning (includes all the vertices)

Goal: Find a spanning tree with minimum weight

下面两种Algorithm都属于Greedy algorithm

#Kruskal’s Algorithm

利用Union Find.

  1. Sort all edges by weight
  2. Add next edge to Tree unless its creating a cycle (Union-Find)
#Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
public class KruskalMST {
private class Edge {
int v;
int w;
int weight;
public Edge(final int v, final int w, final int weight) {
this.v = v;
this.w = w;
this.weight = weight;
}
}
public static Queue<Edge> mstByKruskal(final Map<Integer, Map<Integer, Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
final Queue<Edge> mst = new LinkedList<>();
final PriorityQueue<Edge> minHeap = new PriorityQueue<>((x, y) -> x.weight - y.weight);
// 因为MST只作用于Undirected Graph. 所以在处理Edge的时候需要去重.
for (final int v : graph.keySet()) {
if (!visited.contains(v)) {
addEdgeToMinHeap(v, graph, visited, minHeap);
}
}
while (!minHeap.isEmpty()) {
final Edge edge = minHeap.poll();
if (!uf.find(edge.v, edge.w)) {
uf.union(edge.v, edge.w);
mst.offer(edge);
}
}
return mst;
}
private void addEdgeToMinHeap(final int v, final Map<Integer, Map<Integer, Integer>> graph, final Set<Integer> visited, final PriorityQueue<Edge> minHeap) {
visited.add(v);
for (final int adj : graph.get(v).keySet()) {
if (!visited.contains(adj)) {
minHeap.offer(new Edge(v, adj, graph.get(v).get(adj)));
addEdgeToMinHeap(adj, graph, visited, minHeap);
}
}
}
}
#Performance

Time: O(E * log(E))

构建minHeap的过程其实是O(E). 但是之后按次序遍历每条edge并把它加入tree, 是O(E * log(E))的操作. 因为union-find可以保证union和find都是O(log(E))

#Prim’s Algorithm

  1. Start with vertex 0 and greedily grow tree.
  2. Add to tree the minimum weight edge with exactly one endpoint in tree.
  3. Repeat until V - 1 edges.
#Lazy Implementation

Maintain priority queue by Edges

和Kruskal’s Algorithm一样, 需要维护一个MinHeap,
但是和Kruskal Algorithm不一样的是, 不需要把所有Edge都一股脑的先加到PriorityQueue里面.

因为我们每次要找的Edge得保证至少一个endpoint在tree里, 所以在minHeap里的东西只需要是所有跟当前mst connect的edges即可.

#Code

最终的结果mst可以有多种返回方式, 比如:

  1. 可以返回一个Queue
  2. 也可以返回一个Set

如果是返回一个Set的话, 因为mst的定义就是要包含图中所有vertices, 所以如果mst就可以把它当做visited来用. 而如果mst是一个Queue, 那么需要另外一个visited的数据结构来记录访问情况.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
public class LazyPrimMST {
public static Queue<Edge> mstByLazyPrims(final Map<Integer, Map<Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
final PriorityQueue<Edge> minHeap = new PriorityQueue<>((x, y) -> x.weight - y.weight);
final Queue<Edge> mst = new LinkedList<>();
// Start with any ONE of the vertices
for (final int v : graph.keySet()) {
visit(v, graph, minHeap, mst);
break;
}
while (!minHeap.isEmpty()) {
final Edge edge = minHeap.poll();
if (mst.contains(edge.v) && mst.contains(edge.w)) {
continue;
} else {
mst.offer(edge);
if (!visited.contains(edge.v)) visit(edge.v);
if (!visited.contains(edge.w)) visit(edge.w);
}
}
return mst;
}
private static void visit(final int v, final Map<Integer, Map<Integer, Integer>> graph, final PriorityQueue<Edge> minHeap, final Set<Integer> visited) {
visited.add(v);
for (final int adj : graph.get(v)) {
if (!visited .contains(adj)) {
minHeap.offer(new Edge(v, adj, graph.get(v).get(adj)));
}
}
}
}
#Performance

O(E * log(E)) Time

O(E) space

#Eager Implementation

Maintain priority queue of Vertices.

Eager Implementation 需要利用IndexHeapupdate priority. 因为这个方法是将vertex存在priority queue中, 排序方式是vertex与mst之间的weight.

举个栗子:
比如 v - w 的weight是5, v在mst中, w不在. 因为w connected to mst, 所以w结点现在应该在MinHeap中. 假设MinHeap中的top结点是x, x和w之间也有一条边, weight是1。那么当把x加入mst之后, 我们要更新w和tree之间的weight, 从原来的5变为更小的1.

#Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
public class EagerPrimMST {
public static Queue<Edge> mstByEagerPrims(final Map<Integer, Map<Integer, Integer>> graph) {
if (graph == null) {
throw new IllegalArgumentException();
}
final IndexHeap<Integer> minIndexHeap = new IndexHeap<>((x, y) -> x.weight - y.weight);
final Queue<Edge> mst = new LinkedList<>();
final Set<Integer> visited = new HashSet<>();
// Store the minimum weight edge from a vertex to current MST
// Not every edge to a vertex!
final Map<Integer, Edge> vertexToMinWeightEdge = new HashMap<>();
// Start with any one of the vertices
for (final int v : graph.keySet()) {
minIndexHeap.decreaseKey(v, 0);
break;
}
while (!minIndexHeap.isEmpty()) {
final int v = minIndexHeap.poll();
visited.add(v);
// vertexToMinWeightEdge存放的是结点v到当前mst中weight最小的edge.
if (vertexToMinWeightEdge.containsKey(v)) {
mst.offer(vertexToMinWeightEdge.get(v));
}
for (final int adj : graph.get(v)) {
// 之所以没有直接negate condition而是要continue,
// 是因为想要强调obsolete edges的概念.
if (visited.contains(adj)) continue; // Obsolete edge
if (graph.get(v).get(adj) < minIndexHeap.getWeight()) {
if (minIndexHeap.contains(adj)) {
minIndexHeap.decreaseKey(adj, graph.get(v).get(adj));
} else {
minIndexHeap.add(adj, graph.get(v).get(adj));
}
vertexToMinWeightEdge(adj, new Edge(v, adj, graph.get(v).get(adj)));
}
}
}
return mst;
}
}
#Performance

O(E * log(V)) Time

O(V) space


#8. Single Source Shortest Path Problem (SSSP)

#Dijkstra Algorithm

Only work for non-negative weighed directed graph (不一定是acyclic).

#Dijkstra vs Eager Prim

They are essentially the same algorithm.

  • Prim’s algorithm picks the closest vertex to the Tree.
  • Dijkstra algorithm picks the closest vertex to the Source.

Both of them are trying to find the spanning tree. (Dijkstra is to find the shortest path from a source to every other vertices.)

  • Prim’s algorithm is for the undirected graph.
  • Dijkstra algorithm is for the directed graph (with no negative weights).
#Performance

Time: O(E * log(V)) for IndexedHeap

#Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
public class Dijkastra {
public static Map<Integer, Integer> shortestPath(final Map<Integer, Map<Integer, Integer>> graph, final int source) {
if (graph == null || !graph.containsKey(source)) {
throw new IllegalArgumentException();
}
/* Better to verify if the graph only contains non-negative weight */
final IndexHeap<Integer> minIndexHeap = new IndexHeap<>((x, y) -> x.weight - y.weight);
final Map<Integer, Integer> dist = new HashMap<>();
final Map<Integer, Integer> parent = new HashMap<>();
for (final int v : graph.keySet()) {
dist.put(v, Integer.MAX_VALUE);
}
minIndexHeap.add(source, 0);
while (!minIndexHeap.isEmpty()) {
final int v = minIndexHeap.poll();
for (final int adj : graph.get(v)) {
relax(adj, v, graph.get(v).get(adj), minHeap, dist, parent);
}
}
return dist;
}
private void relax(final int v, final int parent, final int weight,
final IndexHeap<Integer> minIndexHeap, final Map<Integer, Integer> dist, Map<Integer, Integer> parent) {
final int updatedWeight = dist.get(parent) + weight;
if (updatedWeight < dist.get(v)) {
dist.put(v, updatedWeight);
parent.put(v, parent);
if (minIndexHeap.containsKey(v)) {
minIndexHeap.decreaseKey(v, updatedWeight);
} else {
minIndexHeap.add(v, updatedWeight);
}
}
}
}

#Topological sort to find shortest path

Only work for weighed DAG, either positive or negative weight

#Algorithm of Acyclic Shortest Path
  1. Topological Sort
  2. Relaxing based on topological order
#Performance

Time: O(E + V)

#Proof

因为是Acyclic, 而且是topological order, 所以each edge is relaxed exactly once.

  1. distTo[w]不可能increase, 因为我们只有当distTo[v] + weight(v, w) < dist[w]的时候才更新dist[w]
  2. distTo[v]不可能再有所改变. 因为是topological order, 当处理到w的时候可以保证已经没有任何未处理的edge再指向v.
#Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
public class AcyclicShortestPath {
public static Map<Integer, Integer> acyclicShortestPath(final Map<Integer, Map<Integer, Integer>> graph, final int source) {
if (graph == null) {
throw new IllegalArgumentException();
}
final Map<Integer, Integer> dist = new HashMap<>();
for (final int v : graph.keySet()) {
dist.put(v, Integer.MAX_VALUE);
}
dist.put(v, 0);
final List<Integer> topologicalOrder = TopologicalSort.topologicalSortDFS(graph);
for (final int v : topologicalOrder) {
for (final int adj : graph.get(v)) {
relax(adj, v, graph.get(v).get(adj), dist);
}
}
return dist;
}
private static void relax(final int v, final int parent, final int weight, Map<Integer, Integer> dist) {
final int updatedWeight = dist.get(parent) + weight;
if (updatedWeight < dist.get(v)) {
dist.put(v, updatedWeight);
}
}
}
#How to find longest path ?
  1. Negate all weights. (This solution allows negative weights)
  2. Run AcyclicShortestPath Algorithm.
  3. Negate back the results.
#Parallel Job Scheduling Problem

Given a set of jobs with durations and precedence constraints, schedule the jobs by finding the start time for each job, so as to achieve the minimum completion time, while respecting the constraints.

This is a longest path problem

Solution:

  1. Create an edge-weighted DAG.
  2. Two virtual vertices representing:
    • Source: “Before everything start”
    • Destination: “After everything is done”
  3. Each job will create 2 vertices
    • Job start
    • Job end
    • The weight is the duration
    • There will be at least 3 edges for each job (not vertex)
      • Job start –> Job end
      • Source –> Job start
      • Job end –> Destination
  4. Each constraint is an edge.

A critical path is the solution to this problem.

  • A critical path is the longest path from source to destination.

证明:

  1. Job和Job之间的先后顺序只会产生一条edge, 但不会有weight. 也就是weight是0.
  2. 有weight的都是每一个Job的Duration.

所以比如下图这种情况:

要保证每个Job都完成的话,只能取longest path而不是shortest path.

#Bellman-Ford Algorithm

Work for Negative Weights on directed graph without negative cycle

Dijkstra Algorithm no longer works!

Reference from Princeton’s Algorithm course.

  • 这个slides里的内容很有意思. 先是举例子说明了为什么Dijkstra Algorithm不能在Negative weighted graph上work.

  • 然后可能有人会想出上面Re-weighting的trick. 但是这样的想法是错误的,因为即使这样得到了一个positive weighted graph, 但是shortest path就不是原来那条了!

#Algorithm

Relax all edges (in any order) V - 1 times

#Why V - 1 ?

Relaxing edges的order很大程度上决定了Bellman-Ford algorithm找到shortest path的快慢.

比如Tushar的视频里举了一个例子:

Reference from Tushar Roy’s youtube video

假设有这么一个简单的graph, source是0. 如果按照左边的edge顺序来进行relax, 很显然第一遍iteration只能更新V2的distance. 第二遍iteration只能更新V3的distance. 最后一遍iteration才能找到V4的shortest path.

但是如果edges的relax顺序换成右边的, 那么在第一轮iteration的时候就能找出所有vertices的shortest distance.

所以在worst case的情况下, 需要进行V - 1次relaxing.

#Performance
  1. Single Source Shortest Path:

    • O(V * E) time
    • O(V) space
  2. All Sources Shortest Path:

    • Apply Bellman-Ford V times. So O(V^2 * E) times.

What's the maximum number of edges with V vertices ?

对于每个vertex, 它最多能有V - 1条边, 那么对于整个graph, 最多能有V * (V - 1)条边. 因此Bellman-Ford algorithm在All sources shortest path problem中worst case的时间复杂度是O( V^4 ). Floyd-Warshall algorithm在这个All sources shortest path problem中只需要O(V^3 )

#Negative Cycle

A directed cycle whose sum of weights is negative

A Shortest path exists iff no negative cycle

  • 因为如果有negative cycle的话, shortest path可以沿着这个cycle无限循环下去, 因为是negative的关系,每绕一圈total weight就更小一点.
#How to find negative cycle ?

上面已经证明了worst case情况下需要进行 V - 1 次relaxing能找到SSSP. 那么我们只需要再多进行一次relax, 如果有vertex的shortest distance有变化(变得更小, 因为relax只有更小才更新), 那么就说明有negative cycle.

#Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
public class BellmanFord {
public void Map<Integer, Integer> shortestPath(Map<Integer, Map<Integer, Integer>> graph, final int source) {
if (graph == null) {
throw new IllegalArgumentException();
}
final Map<Integer, Integer> dist = new HashMap<>();
final Map<Integer, Integer> parent = new HashMap<>();
for (final int v : graph.keySet()) {
dist.put(v, Integer.MAX_VALUE);
}
dist.put(source, 0);
for (int i = 0; i < graph.size(); i++) {
relax(graph, dist, parent);
}
// One more relax to detect negative cycle
if (hasNegativeCycle(graph, dist)) {
throw new IllegalArgumentException();
}
return dist;
}
private void relax(final Map<Integer, Map<Integer, Integer>> graph, final Map<Integer, Integer> dist, final Map<Integer, Integer> parent) {
for (final int v : graph.keySet()) {
for (final int adj : graph.get(v)) {
final int updatedWeight = dist.get(v) + graph.get(v).get(adj);
if (updatedWeight < dist[adj]) {
dist.put(adj, updatedWeight);
parent.put(adj, v);
}
}
}
}
private boolean hasNegativeCycle(final Map<Integer, Map<Integer, Integer>> graph, final Map<Integer, Integer> dist) {
for (final int v : graph.keySet()) {
for (final int adj : graph.get(v)) {
final int updatedWeight = dist.get(v) + graph.get(v).get(adj);
if (updatedWeight < dist[adj]) {
return true;
}
}
}
return false;
}
}

#9. Maximum Flow Min Cut Problem

Work for positive weighted directed graph with a source and a target.

#Min Cut Problem

#Cut
  • A cut is a partition of vertices into two disjoint sets
#Capacity of cut
  • A capacity is the sum of capacities of edges from set A to set B

  • 说白了Cut就是把图分为两部分,一部分结点涂黑色,另一部分结点涂白色. Capacity就是所有从黑色结点到白色结点的edges,把它们各自的capacity加起来的总和. Min Cut Problem就是找出怎么分, 能让Capactiy最小.

#Maximum Flow Problem

#Flow of edge
  • Edge的flow就是小于等于edge capacity的一个值.
  • 对于每个vertex, inflow = outflow
#Value of a flow
  • flow的value被定义为target的inflow

#Ford-Fulkerson Algorithm

#Algorithm
  1. Find an Augmenting path (an undirected graph) from s to t:
    • Can increase flow on forward edge (not full)
    • Can decrease flow on backward edge (not empty)
  2. Repeat to find more augmenting paths
  3. Terminate when all paths from s to t are blocked by either:
    • Full forward edge
    • Empty backward edge
#Maximum-Mincut Theorem
#Net flow across a cut

A net flow across a cut(A, B) is the sum of flows on edges from set A to set B, minus the sum of flows on edges from set B to set A

#Flow-value lemma

Let f be any flow and let (A, B) be any cut, then net flow across (A, B) equals the value of f.

#Weak Duality

Let f be any flow and let (A, B) be any cut, then the value of the flow <= the capacity of the cut.

证明:

value of flow = net flow across(A, B) <= capacity of cut

  • value of flow = net flow across 是因为Flow-value lemma
  • net flow across(A, B) <= capacity of cut 是因为net flow across cut的计算方法就是capacity of cut(A, B)减去capacity of cut(B, A)
#Augmenting path theorem

A flow f is a maximum flow iff no augmenting path

#Maximum-Mincut theorem

Value of the maximum flow = capacity of mincut

#Proof

有三个条件, 如果能证明这三个条件之间能够互相推理的通, 就能证明上面两个theorems:

  1. There exists a cut whose capacity equals the value of the flow f
  2. f is a maximum flow
  3. There is no augmenting path with respect to f

证明 1 => 2: Weak Duality, value of the flow <= the capacity of the cut
证明 2 => 3: 假设有一条augmenting path, 那么这条path肯定增加flow, 所以f肯定不是maximum flow
证明 3 => 1:

  • 假设已经没有Augmenting path from s to t.
  • 再假设有一个cut(A, B), A is a set of vertices connected to s by an undirected path with no full forward edges or empty backward edges.
  • 也就是说从s开始,只能沿着还没有满的forward edge, 或者非零的backward edge走,能reach到的结点都属于set A. 其余的都属于set B.
  • 因为已经假设没有Augmenting path了,所以s和t必然属于两个不同的sets.
  • 那么可以证明capacity of cut = net flow across cut. 因为从s开始能reach到结点都已经reach到了, 说明cross edge只有两种可能性:
    1. Full forward edge
    2. Empty backward edge
  • 否则的话这些edges能够继续延伸下去, 之后的结点就会属于set A.
  • 证明了capacity of cut = net flow across cut之后, 就可以利用Flow-value lemma来证明:
    • capacity of cut = net flow across cut = value of f
#Implementation

Flow network因为每条edge上有flow和capacity两个概念所以并不容易represent. 因此引入了一个叫做residual network的概念.

  • Forward edge: Residual capacity = (Capacity of edge) - (flow of edge)
  • Backward edge: Residual capacity = flow of edge

#FlowEdge Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
public class FlowEdge {
@Getter
private final int v, w;
@Getter
private final int capacity;
@Getter
private int flow;
public FlowEdge(final int v, final int w, final int capacity) {
this.v = v;
this.w = w;
this.capacity = capacity;
}
public int other(final int vertex) {
if (vertex == v) return w;
else if (vertex == w) return v;
else throw new IllegalArgumentException;
}
public int residualCapacityTo(final int vertex) {
if (vertex == v) return flow;
else if (vertex == w) return capacity - flow;
else throw new IllegalArgumentException;
}
public int addResidualFlowTo(final int vertex, final int delta) {
if (vertex == v) flow -= delta;
else if (vertex == w) flow += delta;
else throw new IllegalArgumentException;
}
}
#Ford-Fulkerson code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
public class Ford-Fulkerson {
private Map<Integer, FlowEdge> edgeTo;
private Set<Integer> visited;
public static int maximumFlow(final FlowNetwork graph, final int s, final int t) {
int res = 0;
while (hasAugmentingPath(graph, s, t)) {
int bottle = Integer.MAX_VALUE;
int v = t;
while (v != s) {
bottle = Math.min(bottle, edgeTo.get(v).residualCapacity(v));
v = edgeTo.get(v).other();
}
v = t;
while (v != s) {
edgeTo.get(v).addResidualFlowTo(v, bottle);
v = edgeTo.get(v).other();
}
res += bottle;
}
return res;
}
private static boolean hasAugmentingPath(final FlowNetwork graph, final int s, final int t) {
// Reinitialize every time
edgeTo = new HashMap<>();
visited = new HashSet<>();
final Queue<Integer> queue = new LinkedList<>();
queue.offer(s);
visited.add(s);
while (!queue.isEmpty()) {
final int front = queue.poll();
for (final FlowEdge edge : graph.adj(front)) {
final int w = edge.other();
if (edge.residualCapacityTo(w) > 0 && !visited.contains(w)) {
edgeTo.put(w, edge);
visited.add(w);
queue.offer(w);
}
}
}
return visited.contains(t);
}
}

#10. Traveling Salesman Problem

TBD

#Reference

Princeton Algorithm Course

Tushar Roy

Comments