Graph Algorithms

For an intro to graphs refer to this page for more details.

Exploring Undirected Graph

Let G=(V,E)G = (V, E) be an undirected graph.

Task: Find all vertices that can be reached from a given vertex vVv \in V.

def explore(v):
visited(v) = True
previsit(v) # see later
for each edge {u,v} in E do:
if visited(u) = False:
call explore(u)
postvisit(v) # see later

The number of vertices u such that “visited(u) = false” decreases in each recursive call. Since there is a finite number of vertices, the algorithm eventually terminates.

Lemma

Assume that, initially, visited(u) = False. After explore(v) has terminated, visited(u) = True     \iff there is a path from vv to uu.

Connected Components of G = (V, E)

def DFS (G):
for all v in V do:
visited (v) = false
cc = 0
for all v in V do:
if visited (v) = false:
cc = cc + 1
explore (v)

Run-time of DFS

First for-loop: O(V)O(|V|) time

Second for-loop:

  • explore(u) is called exactly once for each vertex u (this may be a recursive call)
  • time spent for explore(u), excluding recursive calls is O(1+degree(u))O(1 + degree(u))

Total time: O(V+uV(1+degree(u)))=O(V+V+2E)=O(V+E)O (|V| + \sum_{u \in V}(1+ \mathrm{degree}(u))) = O(|V| + |V| + 2|E|) = O(|V| + |E|)

Directed Graphs

Assume that G=(V,E)G = (V,E) is directed and acyclic.

Topological Sorting or Ordering is the numbering the vertices 1,2,...,n1, 2, ... , n such that for each edge (u,v),(u)<(v)(u,v), (u) < (v)

def TopologicalOrdering(G):
# Input: a directed acyclic graph G (V, E)
# Output: A topological ordering of V
k = 1
while V != {} do:
Choose a vertex u in V with indegree 0
Give u the number k
k = k + 1
Remove u from G

Prenumbers and Postnumbers

Let G=(V,E)G = (V, E) be a directed graph. For each vertex vVv \in V, we define the following two numbers with respect to Depth-First-Search.

  • pre(v): the first time we visit vv (the time at which explore(v) is called)
  • post(v): the time at which explore(v) is finished.

Use variable clock. At start, clock = 1.

def previsit(v):
pre(v) = clock
clock += 1
def postvisit(v):
post(v) = clock
clock += 1

4 Types of Edges

Tree edge

  • edge vuv \rightarrow u
  • explore(u)explore(u) is called as a recursive call within explore(v)explore(v)
  • solid edge
  • literally the edges we follow when we execute DFS

Forward edge

  • edge vuv \rightarrow u where in the (solid) tree,
  • uu is in subtree of vv
  • uu is not a child of vv
  • shortcut that brings you down the tree

Back edge

  • edge vuv \rightarrow u where in the (solid) tree,
  • vv in subtree of uu
  • enable you to go back in the tree

Cross edge

  • lets you go from one subtree to another
  • all other edges

Acyclic vs Cyclic

GG has a directed cycle if and only if DFS-forest has a back-edge.

How to test if a directed graph is cyclic?

  1. Run DFS (including pre/post-numbers)
  2. For each non-tree edge(v, u), test if pre(u) < pre(v) > post(v) < post(u)
  • if "yes" fpr at least one non-tree edge, return "cyclic"
  • if "no" for all non-tree edges, return "acyclic"

Running time: O(|V| + |E|)

Topological Sort for directed acyclic graph

  1. Run DFS (including pre/post-numbers)
  2. Run Bucket Sort to sort the vertices by post-number.
  3. Obtain the topological ordering from the reverse sorted order of the post-numbers.

Shortest Paths

Dijkstra's Algorithm

Input:

  • A directed graph G=(V,E)G = (V, E) where each edge (u,v)E(u,v) \in E has a weight wt(u,v)>0wt (u,v) > 0
  • A vertex sVs \in V called the source

Output:

  • For each vertex vVv \in V, (s,v)(s,v) = length of a shortest path from ss to vv.

For each vertex vVv ∈ V, maintain variable d(v)d(v) = length of a shortest path from ss to vv found so far.

At start, d(v)=0d(v) = 0 if v=s,v = s, \infty if vsv \neq s

Loop: Pick a vertex u for which d(u)=δ(s,u)d(u) = δ(s, u). For each edge (u,v),d(v)=mind(v),d(u)+wt(u,v)(u, v), d(v) = min {d(v), d(u) + wt(u, v)}

for each vertex v in V:
d(v) = infinity
d(s) = 0
S = {}
Q = V
while Q != {}:
u = vertex in Q for which d(u) is minimum
delete u from Q
insert u into S
for each edge (u,v):
d(v) = min (d(v), d(u) + wt(u,v))

Example:

  1. Let n=Vn = |V| and m=Em = |E|.
  2. Store QQ in a min-heap, where the key of each vertex is d(v)d(v).
  3. Initialization: O(n)O(n) (including the time to build the heap storing Q=VQ=V).
  4. One iteration:
  • Find uu and delete it from Q.
    • extract_min: O(log(n))O(log (n)) time.
  • For each edge (u, v), we update d(v)d(v)
    • decrease_key: O(log(n))O (log (n)) time.

Total time for one iteration: O(log(n))+O(outdegree(u)log(n))O(log(n)) + O(\mathrm{outdegree}(u) · log(n))
Total running time: O(n)+O(uV(O(log(n))+O(outdegree(u)log(n))))=O(n)+O(log(n)(n+2m))=O((m+n)log(n))O (n) + O (\sum_{u \in V} (O(log(n)) + O(\mathrm{outdegree}(u) · log(n)))) = O(n) + O (log(n)(n + 2m)) = O ((m + n) log(n))

Dijkstra Correctness

Theorem

Let G=(V,E)G = (V, E) be a weighted directed graph and sVs \in V be a source vertex. Dijkstra algorithm finds the lengths of the shortest paths from ss to all vertices in VV.

We prove by induction that

  1. if a vertex uu is in SS, then d(u)d(u) gives the length of a shortest path from ss to uu and
  2. if a vertex uu is not in SS, then d(u)d(u) gives the length of a shortest special path from ss to uu.

Base case:

  1. At the beginning, S=S = {}, so (1) above is vacuously true.
  2. Since S=S = {}, the other vertices simply cannot be reached by following a special path from ss. Since dd is initialized to \infty, then (2) above also holds when the algorithm starts.

Induction hypothesis: Assume that both (1) and (2) hold right before we add a new vertex v to S. We address induction steps (1) and (2) separately.

Induction step (1): By the induction hypothesis (1), before the addition of vv, we already know a shortest path from ss to all vertices that are in SS. Adding vv to SS does not change these shortest paths.

As for node vv, it is about to be inserted in SS. Before adding it to SS, we must check d(v)d(v) gives the length of a shortest path from ss to vv.

So we compare d(v)d(v) to the lengths of all paths from ss to vv. There are two kinds of paths:

  • the paths that are special
  • the ones that are not

By the induction hypothesis (2), we already know that d(v)d(v) is less than or equal to the length of any special path from ss to vv. A non-special path from ss to vv is one which contains at least one vertex xvx \neq v that is not in SS. But such a path has length at least d(x)d(v)d(x) \geq d(v).

Hence, d(v)d(v) is less than or equal to the length of any path from ss to vv, which means that d(v)d(v) gives the length of a shortest path from ss to vv. So the induction step is complete for (1)(1).

Induction step (2): Consider a node wvw \neq v which is not in SS. Let yy be the last vertex in SS encountered on a shortest special path from ss to ww. We consider:

  • yvy \neq v
  • y=vy = v

If yvy \neq v then adding vv to SS does not change the value of d(w)d(w). Hence, by the induction hypothesis (2), d(w)d(w) gives the length of a shortest special path from ss to ww.

If y=vy = v, then by doing d(w)=min(d(w),d(v)+wt(v,w))d(w) = \mathrm{min} (d(w), d(v) + wt(v, w)), we ensure that after adding vv to SS, d(w)d(w) gives the length of a shortest special path from ss to ww.