## #14 Bipartite Problem

### #14.1 Decide whether a given undirected graph is a bipartite graph

Algorithm

BFS

Java IsBipartite

TODO

Python IsBipartite

### #14.2 Maximum Cardinality Bipartite Matching (MCBM)

Some examples of unweighted bipartite matching problem that could use maximum flow algorithm, like Ford-Fulkerson to solve:

• Mice and Owls problem
• Book selection problem
• Elementary Math problem

## #13. Network Flow

### #13.1 Concept

#### #What is a Flow Graph?

A flow graph is a `directed`, where each edge (also called an arc) has a certain capacity

#### #What is an Augmenting Path?

Augmenting path is a path of edges in the residual graph with remaining capacity greater than 0 from the source s to the destination t. • How to find the augmenting path is flexible in Ford-Fulkerson algorithm, there are multiple ways.
• Every augmenting path has a `bottle-neck`, which is the smallest `(capacity - flow)` on the path.
• In the above example, the bottle-neck is edge (3, 0)
• `Augmenting the flow` means updating the flow values along the augmenting path.
• For forward edge, increasing the flow by the bottle-neck value.
• For backward edge, also called `residual edge`, decreasing the flow by the bottle-neck value
• `Residual edge` exist to undo bad augmenting paths that do not lead to a maximum flow

#### #What is Residual Graph?

• Think every edge in the original graph as also having a residual edge with a flow/capacity of `0/0`
• The resulting graph is a residual graph
• Example of residual graph ### #13.2 Ford-Fulkerson Algorithm

Algorithm

• Keep finding augmenting path, then augments the flow until no more augmenting path from s to t
• The sum of the bottle-necks found in each augmenting path is equal to the max flow.
• The flow into the destination is also the max flow

Assuming the method of finding augmenting paths is by using DFS, then the Ford-Fulkerson algorithm runs in `O(fE)`, where f is the maximum flow and E is the number of Edges.

There are other ways to find the augmenting path, `Edmonds-Karp` is the famous one.

Python Ford-Fulkerson Algorithm

### #13.3 Edmonds Karp Algorithm

• Using BFS instead of DFS to find the augmenting path.
• Takes `O(VE^2 )` time, the difference is this time complexity no longer depends on the capacity value of any edge.

Python Ford-Fulkerson with Edmonds Karp

### #13.4 Capacity Scaling

capacity scaling is another technique to use heuristic to efficiently find augmenting path.

1. Find the largest capacity in the flow graph, call it U
2. Find the delta, which is the largest 2^x that’s less or equal to U
3. Find augmenting path with capacity > than delta
4. If no more augmenting path has capacity greater than delta, then delta /= 2
5. Repeat until delta <= 0

### #13.5 Dinic Algorithm

Another heuristic to efficiently find augmenting path. 太难了.

## #12. Traveling Salesman Problem

• Dynamic Programming problem

Python Traveling Salesman Problem

## #11. Shortest Path

• Topo sort + relax: O(E + V) time. Only work for `DAG`
• Dijkstra, work for graph without negative weight. 可以有环.
• Bellman-Ford, work for graph without negative cycle, 可以有negative weight, 也可以有环
• Floyd-Warshall, all sources shortest path algorithm
Algorithm Condition Time Space
Topological Sort + Relaxing DAG O(V) O(V)
Dijkstra Without Negative Edge O((E + V)log(V))
Bellman-Ford Without Negative Cycle O(EV)
Floyd-Warshall All pairs shortest path O(V^3 )

### #11.1 Topological Sort to Find Shortest Path for Directed Acyclic Graph (DAG)

`Only work for weighed DAG, either positive or negative weight`

#### #Algorithm of Acyclic Shortest Path

1. Topological Sort
2. Relaxing based on topological order

#### #Proof

1. distTo[w]不可能increase, 因为我们只有当distTo[v] + weight(v, w) < dist[w]的时候才更新dist[w]
2. distTo[v]不可能再有所改变. 因为是topological order, 当处理到w的时候可以保证已经没有任何未处理的edge再指向v.

Java Acyclic Shortest Path Problem

Python Acyclic Shortest Path Problem

### #11.2 Longest Path Problem

• For general longest path problem, it’s an NP-Hard problem
• For DAG, use topological sort to find the longest path

#### #Algorithm

1. Negate all weights. (This solution allows negative weights)
2. Run AcyclicShortestPath Algorithm.
3. Negate back the results.

#### #11.2.1 Parallel Job Scheduling Problem

Given a set of jobs with durations and precedence constraints, schedule the jobs by finding the start time for each job, so as to achieve the minimum completion time, while respecting the constraints.

`This is a longest path problem` Solution:

1. Create an edge-weighted DAG.
2. Two virtual vertices representing:
• Source: “Before everything start”
• Destination: “After everything is done”
3. Each job will create 2 vertices
• Job start
• Job end
• The weight is the duration
• There will be at least 3 edges for each job (not vertex)
• Job start –> Job end
• Source –> Job start
• Job end –> Destination
4. Each constraint is an edge.

The longest path of this problem is also called a “critical path” 1. Job和Job之间的先后顺序只会产生一条edge, 但不会有weight. 也就是weight是0.
2. 有weight的都是每一个Job的Duration. • 要保证每个Job都完成的话，只能取longest path而不是shortest path.

### #11.3 Dijkstra Algorithm

`Only work for non-negative weighed directed graph (可以有环)`.

Dijkstra有两种implementation:

1. Lazy Implementation
• not good for dense graph
• lots of duplicated tuples
2. Eager Implementation with `Indexed Heap`

[Notes]

Dijkstra中如果一个node被visit过了, 表示这个node到source的最短距离已经被找到了, 不能improve了, 这也是为什么Dijkstra不能有negative weight的原因.

#### #Dijkstra vs Eager Prim

They are essentially the `same` algorithm.

• Prim’s algorithm picks the closest vertex to the Tree.
• Dijkstra algorithm picks the closest vertex to the Source.

`Both of them are trying to find the spanning tree.` (Dijkstra is to find the shortest path from a source to every other vertices.)

• Prim’s algorithm is for the undirected graph.
• Dijkstra algorithm is for the directed graph (with no negative weights).

#### #Performance

Java Dijkstra Eager Implementation

Python Dijkstra Lazy Implementation

### #11.4 Bellman-Ford Algorithm

`Work for Negative Weights on directed graph without negative cycle`

#### #11.4.1 Why Dijkstra doesn’t work for negative edge?

• 这个slides里的内容很有意思. 先是举例子说明了为什么Dijkstra Algorithm不能在Negative weighted graph上work.
• 然后可能有人会想出上面Re-weighting的trick. 但是这样的想法是错误的，因为即使这样得到了一个positive weighted graph, 但是shortest path就不是原来那条了! • Reference from Princeton’s Algorithm course.

#### #11.4.2 Algorithm

Relax all edges (in any order) `V - 1` times

Bellman-Ford is a Dynamic Programming Algorithm

#### #11.4.3 Proof: Why V - 1 ?

Check here to see why Bellman-ford algorithm works.

Another proof from Tushar Roy

Relaxing edges的`order`很大程度上决定了Bellman-Ford algorithm找到shortest path的快慢. #### #11.4.5 Bellman-Ford For All Sources Shortest Path Problem

We can apply Bellman-Ford `V` times to get all sources shortest distances. (Make every vertex as the single source)

So the time complexity is `O(V^2 * E)`. But What’s the maximum number of edges with V vertices ?

• 对于每个vertex, 它最多能有`V - 1`条边, 那么对于整个graph, 最多能有`V * (V - 1)`条边.
• 因此Bellman-Ford algorithm在All sources shortest path problem中worst case的时间复杂度是`O( V^4 )`.
• Floyd-Warshall algorithm在这个All sources shortest path problem中只需要`O(V^3 )`

#### #11.4.6 How to find Negative Cycle

A Shortest path exists iff no negative cycle

• 因为如果有negative cycle的话, shortest path可以沿着这个cycle无限循环下去, 因为是negative的关系，每绕一圈total weight就更小一点.

Algorithm

Java Bellman-Ford

Python Bellman-Ford

### #11.5 Floyd-Warshall Algorithm

Python Floyd-Warshall Algorithm

## #10. Minimum Spanning Tree

Given: `Undirected Graph` G with `positive` edge weights (connected)

Definition: A spanning tree of G is a subgraph that is both a tree (connected and acyclic) and spanning (includes all the vertices)

Goal: Find a spanning tree with minimum weight

### #10.1 Kruskal’s Algorithm

1. Sort all edges by weight
2. Add next edge to Tree unless its creating a cycle (Union-Find)

#### #Performance

Time: O(E * log(E))

Java Kruskal’s Algorithm

Python Kruskal’s Algorithm

TODO

### #10.2 Prim’s Algorithm

• Prims’s Algorithm Works best for dense graph
• lazy version has runtime O(E * log(E))
• eager version has runtime O(E * log(V))

#### #Algorithm

2. Add to tree the minimum weight edge `with exactly one endpoint in tree`.
3. Repeat until `V - 1` edges.

#### #10.2.1 Lazy Implementation

• Maintain priority queue by `Edges`, 和Kruskal’s Algorithm一样, 需要维护一个MinHeap,
• 但是和Kruskal Algorithm不一样的是, 不需要把所有Edge都一股脑的先加到PriorityQueue里面.
• 因为我们每次要找的Edge得保证至少一个endpoint在tree里, 所以在minHeap里的东西只需要是所有跟当前mst connect的edges即可.
• 最终的结果mst可以有多种返回方式, 比如:
1. 可以返回一个Queue
2. 也可以返回一个Set
• 如果是返回一个Set的话, 因为mst的定义就是要包含图中所有vertices, 所以如果mst就可以把它当做visited来用. 而如果mst是一个Queue, 那么需要另外一个visited的数据结构来记录访问情况.

Java Prim’s Lazy Implementation

Python Prims’s Lazy Implementation

#### #10.2.2 Eager Implementation

• Maintain priority queue of `Vertices`.
• Eager Implementation 需要利用`IndexHeap``update priority`. 因为这个方法是将vertex存在priority queue中, 排序方式是vertex与mst之间的weight.

Java Prim’s Eager Implementation

## #9. Hamilton Cycle / Path

Hamilton Cycle: A cycle with each `vertex` exactly once.

`Hamilton Cycle是一个Backtracking问题!`

### #9.1 Hamiltonian Cycle for Directed Graph

Java Hamiltonian Cycle

### #9.2 Hamiltonian Cycle for Undirected Graph

`Hamilton Cycle对于Undirected Graph依旧是一个Backtracking的problem`

Java Hamiltonian Cycle

## #8. Eulerian Cycle / Eulerian Path

Eulerian Cycle Eulerian Path
Undirected Graph Every vertex has an even degree Every vertex has an even degree or only 2 vertices can have odd degree
Directed Graph Every vertex has the same indegree and outdegree At most 1 vertex has (outdegree) - (indegree) = 1, and at most 1 vertex has (indegree)- (outdegree) = 1, and all other vertices has the same indegree and outdegree
• If a graph has an Eulerian Cycle, it must have an Eulerian Path

### #8.1 Eulerian Cycle for Directed Graph

A directed graph has an Euler Cycle iff:

1. Every vertex with non-zero degree belong to a single strongly connected component
2. Every vertex must have the same indegree and outdegree

[Notes]

• A singleton is a valid node in the eulerian graph. Because eulerian cycle or eulerian path visit each `edge` once, there is no need to visit that singleton node.
• In a directed graph, an eulerian path has only one starting node and only one end node. Starting node has `(outdegree) - (indegree) = 1`, while end node has `(indegree) - (outdegree) = 1`

Java: Check if the graph has Eulerian Cycle

### #8.2 Hierholzer’s Algorithm

Find Euler path/cycle in `Directed graph`

Java Hierholzer’s Algorithm

`仔细理解一下下面的code, 用一个Queue来构建adjacency list会省了很多事.`

Python Hierholzer’s Algorithm

### #8.2 Eulerian Cycle/Path for Undirected Graph

An undirected graph has an `Euler Cycle` iff:

1. Every vertex with non-zero degree belong to a single connected component
2. Every vertex must have `even` number as its degree

Implementation中注意区分Directed Graph里的`isStronglyConnected`和Undirected Graph里的`isConnected`. Strongly Connected Component的概念只有在Directed Graph里才会出现!

An undirected graph has an `Euler Path` iff:

1. Every vertex with non-zero degree belong to a single strongly connected component
2. Only 0 or 2 vertices are allowed to have `odd` number as its degree, all other vertices must have `even` number as their degrees.

`找path和找cycle几乎一模一样, 唯一的区别就是判断奇数degree的结点个数是不是0或者2`

## #7. Articulation Points (Cut Vertex)

1. What is Articulation Point?

• Articulation points are the nodes in a graph, whose removal increases the number of connected components

2. How to find articulation points?

1. If an edge (v, w) is a bridge, then either v or w is an articulation point. But articulation points can exist without any bridges.
• 2. If id(e.from) == lowlink(e.to), that means there is a cycle
• 3. A cycle must be a connected component, and e.from is the starting node of the cycle, which indicates it’s an articulation point.
• Except the starting node has no `outgoing` edges. Or only 1 outgoing edge, which is part of the cycle.
• 4. So an articulation point must have 2 or more outgoing edges.

[Notes] The “outgoing” here is a logical outgoing, it does not mean the outgoing edge in directed graph, since the whole problem is on an undirected graph. It really means the edges connected to the node.

Python Find Articulation Points

## #6. Bridges (Cut Edges)

1. What is Bridge?

• Bridges are the edges in a graph, whose removal increases the number of connected components

2. Why do we want find bridges?

• Bridges are often hint at weak points, bottle necks or vulnerabilities in a graph
3. How to find bridges?

• If (v, w) is an edge in the graph, the edge is a bridge iff `id(v) < lowlink(w)`

Python Find Bridges

## #5. Strongly Connected Component

1. What is connected graph ?

• A graph is `connected` if there is a path between each pair of vertices.
2. What is strongly connected graph ?

• A `directed graph` is `strongly connected` iff there is a path between each pair of vertices.
• Strongly connected的概念只存在于Directed Graph

[Notes]

Strongly connected components in G are the same as they are in reversed G!

### #5.1 Kosaraju’s Algorithm

• Algorithm:
1. Topological sort on reversed G!
2. Run DFS based on the order from step 1

#### #Proof of Kosaraju’s Algorithm:

1. What is `Kernel DAG`?

• Kernel DAG是将一个graph里的每个strongly connected component变成一个vertex, 然后把所有这些SCCs给连起来. 得到的结果一定是一个DAG. 比如Tushar视频里的这个图:
• • Reference: Strongly Connected Components Kosaraju’s Algorithm Graph Algorithm from Tushar Roy
• 4个SCCs连起来的一定是一个DAG. 假设有一条边从(DEF)到(ABC)，那么很容易证明(ABCDEF)是一个SCC而不是两个.
2. Why Kernal DAG ?

Kernel DAG解释了为什么我们要先把G到倒过来求topological order.

• 用上图的例子来看, 这个order保证了在第二遍找SCCs的DFS的时候我们会从(DEF)中任何一个结点或者(K)来开始DFS, 假设从(DEF)中任何一个结点开始, 根据上面的图很显然我们没有办法走到ABC或者GHIJ. 假设从K开始也很显然没有办法走到GHIJ.
• 这样就能找到(DEF)和(K)两个strongly connected component. 接下来无论从(ABC)还是(GHIJ)中任意一个结点开始都能顺利找到自己的strongly connected component. 这样就证明了这个算法的可行性.

Java Kosaraju’a Algorithm

Python Kosaraju’s Algorithm

### #5.2 Tarjan’s Algorithm

• `Low-link value`: low-link value of a node is the smallest node id reachable from that node when doing a dfs, including itself.

2. How to update low-link value to find Strongly Connected Component?

• If there is an edge from v to w in the graph, to update the low-link value of v by `lowlink[v] = min(lowlink[v], lowlink[w])`, the following 2 conditions must be met in order to find the strongly connected component:
1. there is a path from v to w
2. w must be on the stack
• If a node v is the starting point of a strongly connected component, then it’s `id(v) == lowlink(v)`
3. Algorithms:

1. Maintains a stack, node is added to the stack when they’re visited the first time
2. Nodes are removed from the stack when a complete SCC is found
3. If (v, w) is an edge in the graph, to update lowlink value of v, w must be on the stack.
4. If a node is the starting point the a SCC, it’s value (or id) must be equal to it’s lowlink value

Python Tarjan’s Algorithm

TODO

Java

Python

## #3. Cycle Detection

### #3.1 How to detect cycle?

1. DFS
2. BFS
3. Topological Sort (Only for Directed Graph)

#### #3.1.1 DFS to Detect Cycle

DFS Cycle Detection for Directed Graph

DFS Cycle Detection for Undirected Graph

#### #3.1.2 BFS to Detect Cycle

BFS Cycle Detection For Directed Graph

BFS Cycle Detection For Undirected Graph

Solution: 记录访问情况

parent只需要是一个`link`, `Map<Integer, Integer>`即可, 不需要是一个`Map<Integer, Set<Integer>>`. 因为即使有多个结点v1, v2指向w, 因为是undirected graph的关系, 完全可以看做是v1指向w, w再指向v2.

### #3.2 How to detect negative cycle?

1. Bellman-Ford
2. Floyd-Warshall

See more details in Shortest Path Section

## #2. Topological Sort

`Topological sort的概念只针对DAG! 有环的有向图也不行!`

Java

Python

Java

Python

## #1. Search

• Search Algorithms work for both Direceted and Undirected graph.

Java

Python

Java

Python

# #Graph - 0. Introduction

Before any graph problems, ask yourself the following 4 questions:

1. Directed or Undirected?
2. Weighted or Unweighted?
3. Sparse graph or Dense graph?

## #Graph Theory Problems with Cheat Sheet

1. Search
2. Topological Sort
3. Cycle Detection
4. Shortest Path Problem
• Single Source Shortest Path Problem (SSSP)
• Topological Sort (DAG)
• Dijkstra (without negative edge)
• Lazy Implementation
• Eager Implementation with IndexHeap
• Bellman-Ford (without negative cycle)
• All Sources Shortest Path Problem
• Floyd-Warshall
• Longest Path Problem
5. Connectivity
• Union-Find
• Strongly Connected Component
• Tarjan’s Algorithm
• Kosaraju’s Algorithm
• Bridges (Cut Edges)
• Articulation Points (Cut Vertex)
6. Eulerian Path / Eulerian Cycle
7. Hamiltonian Path / Hamiltonian Cycle
8. Minimum Spanning Tree
• Prim’s Algorithm
• Lazy implementation
• Eager implementation with IndexHeap
• Kruskal’s Algorithm
9. Traveling Salesman Problem
10. Network Flow Problem
• Ford-Fulkerson
• Edmonds-Karp
• Capacity Scaling
• Dinic Algorithm
11. Bipartite Problem
• Maximum Cardinality Bipartite Matching (MCBM)

## #Reference

Apply to all Graph related posts